SNAP Library 2.3, Developer Reference  2014-06-16 11:58:46
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros
TAGMFast Class Reference

Community detection with AGM. Sparse AGM-fast with coordinate ascent. More...

#include <agmfast.h>

Collaboration diagram for TAGMFast:

Public Member Functions

 TAGMFast (const PUNGraph &GraphPt, const int &InitComs, const int RndSeed=0)
 
void SetGraph (const PUNGraph &GraphPt)
 
void SetRegCoef (const double _RegCoef)
 
double GetRegCoef ()
 
void RandomInit (const int InitComs)
 
void NeighborComInit (const int InitComs)
 
void SetCmtyVV (const TVec< TIntV > &CmtyVV)
 
double Likelihood (const bool DoParallel=false)
 
double LikelihoodForRow (const int UID)
 
double LikelihoodForRow (const int UID, const TIntFltH &FU)
 
int MLENewton (const double &Thres, const int &MaxIter, const TStr &PlotNm=TStr())
 Newton method: DEPRECATED. More...
 
void GradientForRow (const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
 
double GradientForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
double HessianForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
double LikelihoodForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
void GetCmtyVV (TVec< TIntV > &CmtyVV)
 
void GetCmtyVV (TVec< TIntV > &CmtyVV, const double Thres, const int MinSz=3)
 extract community affiliation from F_uc More...
 
int FindComsByCV (TIntV &ComsV, const double HOFrac=0.2, const int NumThreads=20, const TStr &PlotLFNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
 
int FindComsByCV (const int NumThreads, const int MaxComs, const int MinComs, const int DivComs, const TStr &OutFNm, const double StepAlpha=0.3, const double StepBeta=0.3)
 estimate number of communities using cross validation More...
 
double LikelihoodHoldOut (const bool DoParallel=false)
 
double GetStepSizeByLineSearch (const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
 
int MLEGradAscent (const double &Thres, const int &MaxIter, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
 
int MLEGradAscentParallel (const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
 
int MLEGradAscentParallel (const double &Thres, const int &MaxIter, const int ChunkNum, const TStr &PlotNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
 
void Save (TSOut &SOut)
 
void Load (TSIn &SIn, const int &RndSeed=0)
 
double GetCom (const int &NID, const int &CID)
 
void AddCom (const int &NID, const int &CID, const double &Val)
 
void DelCom (const int &NID, const int &CID)
 
double DotProduct (const TIntFltH &UV, const TIntFltH &VV)
 
double DotProduct (const int &UID, const int &VID)
 
double Prediction (const TIntFltH &FU, const TIntFltH &FV)
 
double Prediction (const int &UID, const int &VID)
 
double Sum (const TIntFltH &UV)
 
double Norm2 (const TIntFltH &UV)
 

Public Attributes

TVec< TIntSetHOVIDSV
 
TFlt MinVal
 
TFlt MaxVal
 
TFlt NegWgt
 
TFlt PNoCom
 
TBool DoParallel
 

Private Attributes

PUNGraph G
 
TVec< TIntFltHF
 
TRnd Rnd
 
TIntV NIDV
 
TFlt RegCoef
 
TFltV SumFV
 
TBool NodesOk
 
TInt NumComs
 

Detailed Description

Community detection with AGM. Sparse AGM-fast with coordinate ascent.

Definition at line 7 of file agmfast.h.

Constructor & Destructor Documentation

TAGMFast::TAGMFast ( const PUNGraph GraphPt,
const int &  InitComs,
const int  RndSeed = 0 
)
inline

Definition at line 25 of file agmfast.h.

References RandomInit(), and SetGraph().

25  : Rnd(RndSeed), RegCoef(0),
26  NodesOk(true), MinVal(0.0), MaxVal(1000.0), NegWgt(1.0) { SetGraph(GraphPt); RandomInit(InitComs); }
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
TBool NodesOk
Definition: agmfast.h:15
void SetGraph(const PUNGraph &GraphPt)
Definition: agmfast.cpp:146
TFlt RegCoef
Definition: agmfast.h:13
TRnd Rnd
Definition: agmfast.h:11
TFlt MaxVal
Definition: agmfast.h:20
void RandomInit(const int InitComs)
Definition: agmfast.cpp:38

Here is the call graph for this function:

Member Function Documentation

void TAGMFast::AddCom ( const int &  NID,
const int &  CID,
const double &  Val 
)
inline

Definition at line 64 of file agmfast.h.

References F, and SumFV.

Referenced by MLEGradAscent(), MLENewton(), NeighborComInit(), RandomInit(), and SetCmtyVV().

64  {
65  if (F[NID].IsKey(CID)) {
66  SumFV[CID] -= F[NID].GetDat(CID);
67  }
68  F[NID].AddDat(CID) = Val;
69  SumFV[CID] += Val;
70  }
TVec< TIntFltH > F
Definition: agmfast.h:10
TFltV SumFV
Definition: agmfast.h:14

Here is the caller graph for this function:

void TAGMFast::DelCom ( const int &  NID,
const int &  CID 
)
inline

Definition at line 72 of file agmfast.h.

References F, and SumFV.

Referenced by MLEGradAscent(), and MLENewton().

72  {
73  if (F[NID].IsKey(CID)) {
74  SumFV[CID] -= F[NID].GetDat(CID);
75  F[NID].DelKey(CID);
76  }
77  }
TVec< TIntFltH > F
Definition: agmfast.h:10
TFltV SumFV
Definition: agmfast.h:14

Here is the caller graph for this function:

double TAGMFast::DotProduct ( const TIntFltH UV,
const TIntFltH VV 
)
inline

Definition at line 78 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and THash< TKey, TDat, THashFunc >::Len().

Referenced by DotProduct(), GetStepSizeByLineSearch(), LikelihoodForRow(), MLENewton(), and Prediction().

78  {
79  double DP = 0;
80  if (UV.Len() > VV.Len()) {
81  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
82  if (VV.IsKey(HI.GetKey())) {
83  DP += VV.GetDat(HI.GetKey()) * HI.GetDat();
84  }
85  }
86  } else {
87  for (TIntFltH::TIter HI = VV.BegI(); HI < VV.EndI(); HI++) {
88  if (UV.IsKey(HI.GetKey())) {
89  DP += UV.GetDat(HI.GetKey()) * HI.GetDat();
90  }
91  }
92  }
93  return DP;
94  }
TIter BegI() const
Definition: hash.h:171
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:220
TIter EndI() const
Definition: hash.h:176
bool IsKey(const TKey &Key) const
Definition: hash.h:216
int Len() const
Definition: hash.h:186

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::DotProduct ( const int &  UID,
const int &  VID 
)
inline

Definition at line 95 of file agmfast.h.

References DotProduct(), and F.

95  {
96  return DotProduct(F[UID], F[VID]);
97  }
TVec< TIntFltH > F
Definition: agmfast.h:10
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78

Here is the call graph for this function:

int TAGMFast::FindComsByCV ( TIntV ComsV,
const double  HOFrac = 0.2,
const int  NumThreads = 20,
const TStr PlotLFNm = TStr(),
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 554 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), TUNGraph::BegEI(), TStr::Empty(), TUNGraph::EndEI(), G, TVec< TVal, TSizeTy >::Gen(), TUNGraph::GetEdges(), TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TRnd::GetUniDevInt(), HOVIDSV, TVec< TVal, TSizeTy >::Last(), TVec< TVal, TSizeTy >::Len(), Likelihood(), LikelihoodHoldOut(), MLEGradAscent(), MLEGradAscentParallel(), TFlt::Mn, NeighborComInit(), TGnuPlot::PlotValV(), RandomInit(), Rnd, and TMath::Round().

Referenced by FindComsByCV().

554  {
555  if (ComsV.Len() == 0) {
556  int MaxComs = G->GetNodes() / 5;
557  ComsV.Add(2);
558  while(ComsV.Last() < MaxComs) { ComsV.Add(ComsV.Last() * 2); }
559  }
560  TIntPrV EdgeV(G->GetEdges(), 0);
561  for (TUNGraph::TEdgeI EI = G->BegEI(); EI < G->EndEI(); EI++) {
562  EdgeV.Add(TIntPr(EI.GetSrcNId(), EI.GetDstNId()));
563  }
564  EdgeV.Shuffle(Rnd);
565  int MaxIterCV = 3;
566 
567  TVec<TVec<TIntSet> > HoldOutSets(MaxIterCV);
568  if (EdgeV.Len() > 50) { //if edges are many enough, use CV
569  printf("generating hold out set\n");
570  TIntV NIdV1, NIdV2;
571  G->GetNIdV(NIdV1);
572  G->GetNIdV(NIdV2);
573  for (int IterCV = 0; IterCV < MaxIterCV; IterCV++) {
574  // generate holdout sets
575  HoldOutSets[IterCV].Gen(G->GetNodes());
576  const int HOTotal = int(HOFrac * G->GetNodes() * (G->GetNodes() - 1) / 2.0);
577  int HOCnt = 0;
578  int HOEdges = (int) TMath::Round(HOFrac * G->GetEdges());
579  printf("holding out %d edges...\n", HOEdges);
580  for (int he = 0; he < (int) HOEdges; he++) {
581  HoldOutSets[IterCV][EdgeV[he].Val1].AddKey(EdgeV[he].Val2);
582  HoldOutSets[IterCV][EdgeV[he].Val2].AddKey(EdgeV[he].Val1);
583  HOCnt++;
584  }
585  printf("%d Edges hold out\n", HOCnt);
586  while(HOCnt++ < HOTotal) {
587  int SrcNID = Rnd.GetUniDevInt(G->GetNodes());
588  int DstNID = Rnd.GetUniDevInt(G->GetNodes());
589  HoldOutSets[IterCV][SrcNID].AddKey(DstNID);
590  HoldOutSets[IterCV][DstNID].AddKey(SrcNID);
591  }
592  }
593  printf("hold out set generated\n");
594  }
595 
596  TFltV HOLV(ComsV.Len());
597  TIntFltPrV ComsLV;
598  for (int c = 0; c < ComsV.Len(); c++) {
599  const int Coms = ComsV[c];
600  printf("Try number of Coms:%d\n", Coms);
601  NeighborComInit(Coms);
602  printf("Initialized\n");
603 
604  if (EdgeV.Len() > 50) { //if edges are many enough, use CV
605  for (int IterCV = 0; IterCV < MaxIterCV; IterCV++) {
606  HOVIDSV = HoldOutSets[IterCV];
607 
608  if (NumThreads == 1) {
609  printf("MLE without parallelization begins\n");
610  MLEGradAscent(0.05, 10 * G->GetNodes(), "", StepAlpha, StepBeta);
611  } else {
612  printf("MLE with parallelization begins\n");
613  MLEGradAscentParallel(0.05, 100, NumThreads, "", StepAlpha, StepBeta);
614  }
615  double HOL = LikelihoodHoldOut();
616  HOL = HOL < 0? HOL: TFlt::Mn;
617  HOLV[c] += HOL;
618  }
619  }
620  else {
621  HOVIDSV.Gen(G->GetNodes());
622  MLEGradAscent(0.0001, 100 * G->GetNodes(), "");
623  double BIC = 2 * Likelihood() - (double) G->GetNodes() * Coms * 2.0 * log ( (double) G->GetNodes());
624  HOLV[c] = BIC;
625  }
626  }
627  int EstComs = 2;
628  double MaxL = TFlt::Mn;
629  printf("\n");
630  for (int c = 0; c < ComsV.Len(); c++) {
631  ComsLV.Add(TIntFltPr(ComsV[c].Val, HOLV[c].Val));
632  printf("%d(%f)\t", ComsV[c].Val, HOLV[c].Val);
633  if (MaxL < HOLV[c]) {
634  MaxL = HOLV[c];
635  EstComs = ComsV[c];
636  }
637  }
638  printf("\n");
639  RandomInit(EstComs);
640  HOVIDSV.Gen(G->GetNodes());
641  if (! PlotLFNm.Empty()) {
642  TGnuPlot::PlotValV(ComsLV, PlotLFNm, "hold-out likelihood", "communities", "likelihood");
643  }
644  return EstComs;
645 }
Edge iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:111
TPair< TInt, TInt > TIntPr
Definition: ds.h:83
PUNGraph G
Definition: agmfast.h:9
TEdgeI EndEI() const
Returns an iterator referring to the past-the-end edge in the graph.
Definition: graph.h:229
int MLEGradAscentParallel(const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:761
double LikelihoodHoldOut(const bool DoParallel=false)
Definition: agmfast.cpp:647
void NeighborComInit(const int InitComs)
Definition: agmfast.cpp:60
void GetNIdV(TIntV &NIdV) const
Gets a vector IDs of all nodes in the graph.
Definition: graph.cpp:125
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:74
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
int MLEGradAscent(const double &Thres, const int &MaxIter, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:690
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
static double Round(const double &Val)
Definition: xmath.h:16
TEdgeI BegEI() const
Returns an iterator referring to the first edge in the graph.
Definition: graph.h:227
const TVal & Last() const
Returns a reference to the last element of the vector.
Definition: ds.h:539
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:488
static void PlotValV(const TVec< TPair< TVal1, TVal2 > > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:363
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39
void RandomInit(const int InitComs)
Definition: agmfast.cpp:38
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559
static const double Mn
Definition: dt.h:1293
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::FindComsByCV ( const int  NumThreads,
const int  MaxComs,
const int  MinComs,
const int  DivComs,
const TStr OutFNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.3 
)

estimate number of communities using cross validation

Definition at line 541 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), FindComsByCV(), TVec< TVal, TSizeTy >::Last(), TVec< TVal, TSizeTy >::Len(), TMath::Log(), and TInt::Val.

541  {
542  double ComsGap = exp(TMath::Log((double) MaxComs / (double) MinComs) / (double) DivComs);
543  TIntV ComsV;
544  ComsV.Add(MinComs);
545  while (ComsV.Len() < DivComs) {
546  int NewComs = int(ComsV.Last() * ComsGap);
547  if (NewComs == ComsV.Last().Val) { NewComs++; }
548  ComsV.Add(NewComs);
549  }
550  if (ComsV.Last() < MaxComs) { ComsV.Add(MaxComs); }
551  return FindComsByCV(ComsV, 0.1, NumThreads, OutFNm + ".CV.likelihood", StepAlpha, StepBeta);
552 }
int Val
Definition: dt.h:1043
int FindComsByCV(TIntV &ComsV, const double HOFrac=0.2, const int NumThreads=20, const TStr &PlotLFNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:554
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
const TVal & Last() const
Returns a reference to the last element of the vector.
Definition: ds.h:539
static double Log(const double &Val)
Definition: xmath.h:14
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559

Here is the call graph for this function:

void TAGMFast::GetCmtyVV ( TVec< TIntV > &  CmtyVV)

Definition at line 508 of file agmfast.cpp.

References G, TUNGraph::GetEdges(), and TUNGraph::GetNodes().

508  {
509  GetCmtyVV(CmtyVV, sqrt(2.0 * (double) G->GetEdges() / G->GetNodes() / G->GetNodes()), 3);
510 }
PUNGraph G
Definition: agmfast.h:9
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:74
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
void GetCmtyVV(TVec< TIntV > &CmtyVV)
Definition: agmfast.cpp:508

Here is the call graph for this function:

void TAGMFast::GetCmtyVV ( TVec< TIntV > &  CmtyVV,
const double  Thres,
const int  MinSz = 3 
)

extract community affiliation from F_uc

Definition at line 513 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), F, TVec< TVal, TSizeTy >::Gen(), GetCom(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), IAssert, TVec< TVal, TSizeTy >::Len(), NIDV, NodesOk, NumComs, THash< TKey, TDat, THashFunc >::SortByDat(), and SumFV.

513  {
514  CmtyVV.Gen(NumComs, 0);
515  TIntFltH CIDSumFH(NumComs);
516  for (int c = 0; c < SumFV.Len(); c++) {
517  CIDSumFH.AddDat(c, SumFV[c]);
518  }
519  CIDSumFH.SortByDat(false);
520  for (int c = 0; c < NumComs; c++) {
521  int CID = CIDSumFH.GetKey(c);
522  TIntFltH NIDFucH(F.Len() / 10);
523  TIntV CmtyV;
524  IAssert(SumFV[CID] == CIDSumFH.GetDat(CID));
525  if (SumFV[CID] < Thres) { continue; }
526  for (int u = 0; u < F.Len(); u++) {
527  int NID = u;
528  if (! NodesOk) { NID = NIDV[u]; }
529  if (GetCom(u, CID) >= Thres) { NIDFucH.AddDat(NID, GetCom(u, CID)); }
530  }
531  NIDFucH.SortByDat(false);
532  NIDFucH.GetKeyV(CmtyV);
533  if (CmtyV.Len() >= MinSz) { CmtyVV.Add(CmtyV); }
534  }
535  if ( NumComs != CmtyVV.Len()) {
536  printf("Community vector generated. %d communities are ommitted\n", NumComs.Val - CmtyVV.Len());
537  }
538 }
#define IAssert(Cond)
Definition: bd.h:262
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TBool NodesOk
Definition: agmfast.h:15
TVec< TIntFltH > F
Definition: agmfast.h:10
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:792
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TIntV NIDV
Definition: agmfast.h:12
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559

Here is the call graph for this function:

double TAGMFast::GetCom ( const int &  NID,
const int &  CID 
)
inline

Definition at line 57 of file agmfast.h.

References F.

Referenced by GetCmtyVV(), GetStepSizeByLineSearch(), GradientForOneVar(), GradientForRow(), LikelihoodForOneVar(), LikelihoodForRow(), MLEGradAscent(), and MLENewton().

57  {
58  if (F[NID].IsKey(CID)) {
59  return F[NID].GetDat(CID);
60  } else {
61  return 0.0;
62  }
63  }
TVec< TIntFltH > F
Definition: agmfast.h:10

Here is the caller graph for this function:

double TAGMFast::GetRegCoef ( )
inline

Definition at line 29 of file agmfast.h.

References RegCoef.

29 { return RegCoef; }
TFlt RegCoef
Definition: agmfast.h:13
double TAGMFast::GetStepSizeByLineSearch ( const int  UID,
const TIntFltH DeltaV,
const TIntFltH GradV,
const double &  Alpha,
const double &  Beta,
const int  MaxIter = 10 
)

Definition at line 665 of file agmfast.cpp.

References DotProduct(), GetCom(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), THash< TKey, TDat, THashFunc >::Len(), LikelihoodForRow(), MaxVal, and MinVal.

Referenced by MLEGradAscent(), and MLEGradAscentParallel().

665  {
666  double StepSize = 1.0;
667  double InitLikelihood = LikelihoodForRow(UID);
668  TIntFltH NewVarV(DeltaV.Len());
669  for(int iter = 0; iter < MaxIter; iter++) {
670  for (int i = 0; i < DeltaV.Len(); i++){
671  int CID = DeltaV.GetKey(i);
672  double NewVal = GetCom(UID, CID) + StepSize * DeltaV.GetDat(CID);
673  if (NewVal < MinVal) { NewVal = MinVal; }
674  if (NewVal > MaxVal) { NewVal = MaxVal; }
675  NewVarV.AddDat(CID, NewVal);
676  }
677  if (LikelihoodForRow(UID, NewVarV) < InitLikelihood + Alpha * StepSize * DotProduct(GradV, DeltaV)) {
678  StepSize *= Beta;
679  } else {
680  break;
681  }
682  if (iter == MaxIter - 1) {
683  StepSize = 0.0;
684  break;
685  }
686  }
687  return StepSize;
688 }
TFlt MinVal
Definition: agmfast.h:19
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:220
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193
TFlt MaxVal
Definition: agmfast.h:20
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78
int Len() const
Definition: hash.h:186
const TKey & GetKey(const int &KeyId) const
Definition: hash.h:210

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::GradientForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 339 of file agmfast.cpp.

References F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, NegWgt, RegCoef, and SumFV.

Referenced by MLENewton().

339  {
340  TUNGraph::TNodeI UI = G->GetNI(UID);
341  double Grad = 0.0, PNoEdge;
342  int VID = 0;
343  for (int e = 0; e < UI.GetDeg(); e++) {
344  VID = UI.GetNbrNId(e);
345  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
346  if (! F[VID].IsKey(CID)) { continue; }
347  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
348  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
349  //PNoEdge = PNoEdge >= 1.0 - PNoCom? 1 - PNoCom: PNoEdge;
350  Grad += ((PNoEdge * F[VID].GetDat(CID)) / (1.0 - PNoEdge) + NegWgt * F[VID].GetDat(CID));
351  }
352  Grad -= NegWgt * (SumFV[CID] - GetCom(UID, CID));
353  //add regularization
354  if (RegCoef > 0.0) { //L1
355  Grad -= RegCoef;
356  }
357  if (RegCoef < 0.0) { //L2
358  Grad += 2 * RegCoef * Val;
359  }
360 
361  return Grad;
362 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::GradientForRow ( const int  UID,
TIntFltH GradU,
const TIntSet CIDSet 
)

Definition at line 250 of file agmfast.cpp.

References THash< TKey, TDat, THashFunc >::AddDat(), DoParallel, G, THash< TKey, TDat, THashFunc >::Gen(), TVec< TVal, TSizeTy >::Gen(), GetCom(), TUNGraph::TNodeI::GetDeg(), THashSet< TKey, THashFunc >::GetKey(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), THashSet< TKey, THashFunc >::Len(), NegWgt, Prediction(), RegCoef, and SumFV.

Referenced by MLEGradAscent(), and MLEGradAscentParallel().

250  {
251  GradU.Gen(CIDSet.Len());
252 
253  TFltV HOSumFV; //adjust for Fv of v hold out
254  if (HOVIDSV[UID].Len() > 0) {
255  HOSumFV.Gen(SumFV.Len());
256 
257  for (int e = 0; e < HOVIDSV[UID].Len(); e++) {
258  for (int c = 0; c < SumFV.Len(); c++) {
259  HOSumFV[c] += GetCom(HOVIDSV[UID][e], c);
260  }
261  }
262  }
263 
264  TUNGraph::TNodeI NI = G->GetNI(UID);
265  int Deg = NI.GetDeg();
266  TFltV PredV(Deg), GradV(CIDSet.Len());
267  TIntV CIDV(CIDSet.Len());
268  if (DoParallel && Deg + CIDSet.Len() > 10) {
269 #pragma omp parallel for schedule(static, 1)
270  for (int e = 0; e < Deg; e++) {
271  if (NI.GetNbrNId(e) == UID) { continue; }
272  if (HOVIDSV[UID].IsKey(NI.GetNbrNId(e))) { continue; }
273  PredV[e] = Prediction(UID, NI.GetNbrNId(e));
274  }
275 
276 #pragma omp parallel for schedule(static, 1)
277  for (int c = 0; c < CIDSet.Len(); c++) {
278  int CID = CIDSet.GetKey(c);
279  double Val = 0.0;
280  for (int e = 0; e < Deg; e++) {
281  int VID = NI.GetNbrNId(e);
282  if (VID == UID) { continue; }
283  if (HOVIDSV[UID].IsKey(VID)) { continue; }
284  Val += PredV[e] * GetCom(VID, CID) / (1.0 - PredV[e]) + NegWgt * GetCom(VID, CID);
285  }
286  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[CID].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
287  Val -= NegWgt * (SumFV[CID] - HOSum - GetCom(UID, CID));
288  CIDV[c] = CID;
289  GradV[c] = Val;
290  }
291  }
292  else {
293  for (int e = 0; e < Deg; e++) {
294  if (NI.GetNbrNId(e) == UID) { continue; }
295  if (HOVIDSV[UID].IsKey(NI.GetNbrNId(e))) { continue; }
296  PredV[e] = Prediction(UID, NI.GetNbrNId(e));
297  }
298 
299  for (int c = 0; c < CIDSet.Len(); c++) {
300  int CID = CIDSet.GetKey(c);
301  double Val = 0.0;
302  for (int e = 0; e < Deg; e++) {
303  int VID = NI.GetNbrNId(e);
304  if (VID == UID) { continue; }
305  if (HOVIDSV[UID].IsKey(VID)) { continue; }
306  Val += PredV[e] * GetCom(VID, CID) / (1.0 - PredV[e]) + NegWgt * GetCom(VID, CID);
307  }
308  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[CID].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
309  Val -= NegWgt * (SumFV[CID] - HOSum - GetCom(UID, CID));
310  CIDV[c] = CID;
311  GradV[c] = Val;
312  }
313  }
314  //add regularization
315  if (RegCoef > 0.0) { //L1
316  for (int c = 0; c < GradV.Len(); c++) {
317  GradV[c] -= RegCoef;
318  }
319  }
320  if (RegCoef < 0.0) { //L2
321  for (int c = 0; c < GradV.Len(); c++) {
322  GradV[c] += 2 * RegCoef * GetCom(UID, CIDV[c]);
323  }
324  }
325 
326 
327  for (int c = 0; c < GradV.Len(); c++) {
328  if (GetCom(UID, CIDV[c]) == 0.0 && GradV[c] < 0.0) { continue; }
329  if (fabs(GradV[c]) < 0.0001) { continue; }
330  GradU.AddDat(CIDV[c], GradV[c]);
331  }
332  for (int c = 0; c < GradU.Len(); c++) {
333  if (GradU[c] >= 10) { GradU[c] = 10; }
334  if (GradU[c] <= -10) { GradU[c] = -10; }
335  IAssert(GradU[c] >= -10);
336  }
337 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
const TKey & GetKey(const int &KeyId) const
Definition: shash.h:1135
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
void Gen(const int &ExpectVals)
Definition: hash.h:180
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TFltV SumFV
Definition: agmfast.h:14
int Len() const
Definition: shash.h:1115
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
int Len() const
Definition: hash.h:186
TDat & AddDat(const TKey &Key)
Definition: hash.h:196
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::HessianForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 394 of file agmfast.cpp.

References F, G, TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, and RegCoef.

Referenced by MLENewton().

394  {
395  TUNGraph::TNodeI UI = G->GetNI(UID);
396  double H = 0.0, PNoEdge;
397  int VID = 0;
398  for (int e = 0; e < UI.GetDeg(); e++) {
399  VID = UI.GetNbrNId(e);
400  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
401  if (! F[VID].IsKey(CID)) { continue; }
402  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
403  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
404  //PNoEdge = PNoEdge == 1.0? 1 - PNoCom: PNoEdge;
405  H += (- PNoEdge * F[VID].GetDat(CID) * F[VID].GetDat(CID)) / (1.0 - PNoEdge) / (1.0 - PNoEdge);
406  }
407  //add regularization
408  if (RegCoef < 0.0) { //L2
409  H += 2 * RegCoef;
410  }
411  IAssert (H <= 0.0);
412  return H;
413 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Likelihood ( const bool  DoParallel = false)

Definition at line 173 of file agmfast.cpp.

References F, and LikelihoodForRow().

Referenced by FindComsByCV(), MLEGradAscent(), MLEGradAscentParallel(), and MLENewton().

173  {
174  TExeTm ExeTm;
175  double L = 0.0;
176  if (_DoParallel) {
177  #pragma omp parallel for
178  for (int u = 0; u < F.Len(); u++) {
179  double LU = LikelihoodForRow(u);
180  #pragma omp atomic
181  L += LU;
182  }
183  }
184  else {
185  for (int u = 0; u < F.Len(); u++) {
186  double LU = LikelihoodForRow(u);
187  L += LU;
188  }
189  }
190  return L;
191 }
Definition: tm.h:354
TVec< TIntFltH > F
Definition: agmfast.h:10
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::LikelihoodForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 364 of file agmfast.cpp.

References F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, NegWgt, RegCoef, and SumFV.

364  {
365  TUNGraph::TNodeI UI = G->GetNI(UID);
366  double L = 0.0, PNoEdge;
367  int VID = 0;
368  for (int e = 0; e < UI.GetDeg(); e++) {
369  VID = UI.GetNbrNId(e);
370  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
371  if (! F[VID].IsKey(CID)) {
372  PNoEdge = AlphaKV[e];
373  } else {
374  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
375  }
376  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
377  //PNoEdge = PNoEdge >= 1.0 - PNoCom? 1 - PNoCom: PNoEdge;
378  L += log(1.0 - PNoEdge) + NegWgt * GetCom(VID, CID) * Val;
379 
380  // += ((PNoEdge * F[VID].GetDat(CID)) / (1.0 - PNoEdge) + NegWgt * F[VID].GetDat(CID));
381  }
382  L -= NegWgt * (SumFV[CID] - GetCom(UID, CID)) * Val;
383  //add regularization
384  if (RegCoef > 0.0) { //L1
385  L -= RegCoef * Val;
386  }
387  if (RegCoef < 0.0) { //L2
388  L += RegCoef * Val * Val;
389  }
390 
391  return L;
392 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101

Here is the call graph for this function:

double TAGMFast::LikelihoodForRow ( const int  UID)

Definition at line 193 of file agmfast.cpp.

References F.

Referenced by GetStepSizeByLineSearch(), and Likelihood().

193  {
194  return LikelihoodForRow(UID, F[UID]);
195 }
TVec< TIntFltH > F
Definition: agmfast.h:10
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193

Here is the caller graph for this function:

double TAGMFast::LikelihoodForRow ( const int  UID,
const TIntFltH FU 
)

Definition at line 198 of file agmfast.cpp.

References THash< TKey, TDat, THashFunc >::BegI(), DoParallel, DotProduct(), THash< TKey, TDat, THashFunc >::EndI(), F, G, TVec< TVal, TSizeTy >::Gen(), GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, TVec< TVal, TSizeTy >::Len(), NegWgt, Norm2(), Prediction(), RegCoef, Sum(), and SumFV.

198  {
199  double L = 0.0;
200  TFltV HOSumFV; //adjust for Fv of v hold out
201  if (HOVIDSV[UID].Len() > 0) {
202  HOSumFV.Gen(SumFV.Len());
203 
204  for (int e = 0; e < HOVIDSV[UID].Len(); e++) {
205  for (int c = 0; c < SumFV.Len(); c++) {
206  HOSumFV[c] += GetCom(HOVIDSV[UID][e], c);
207  }
208  }
209  }
210 
211  TUNGraph::TNodeI NI = G->GetNI(UID);
212  if (DoParallel && NI.GetDeg() > 10) {
213 #pragma omp parallel for schedule(static, 1)
214  for (int e = 0; e < NI.GetDeg(); e++) {
215  int v = NI.GetNbrNId(e);
216  if (v == UID) { continue; }
217  if (HOVIDSV[UID].IsKey(v)) { continue; }
218  double LU = log (1.0 - Prediction(FU, F[v])) + NegWgt * DotProduct(FU, F[v]);
219 #pragma omp atomic
220  L += LU;
221  }
222  for (TIntFltH::TIter HI = FU.BegI(); HI < FU.EndI(); HI++) {
223  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[HI.GetKey()].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
224  double LU = NegWgt * (SumFV[HI.GetKey()] - HOSum - GetCom(UID, HI.GetKey())) * HI.GetDat();
225  L -= LU;
226  }
227  } else {
228  for (int e = 0; e < NI.GetDeg(); e++) {
229  int v = NI.GetNbrNId(e);
230  if (v == UID) { continue; }
231  if (HOVIDSV[UID].IsKey(v)) { continue; }
232  L += log (1.0 - Prediction(FU, F[v])) + NegWgt * DotProduct(FU, F[v]);
233  }
234  for (TIntFltH::TIter HI = FU.BegI(); HI < FU.EndI(); HI++) {
235  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[HI.GetKey()].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
236  L -= NegWgt * (SumFV[HI.GetKey()] - HOSum - GetCom(UID, HI.GetKey())) * HI.GetDat();
237  }
238  }
239  //add regularization
240  if (RegCoef > 0.0) { //L1
241  L -= RegCoef * Sum(FU);
242  }
243  if (RegCoef < 0.0) { //L2
244  L += RegCoef * Norm2(FU);
245  }
246 
247  return L;
248 }
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TIter BegI() const
Definition: hash.h:171
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
TIter EndI() const
Definition: hash.h:176
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
double Sum(const TIntFltH &UV)
Definition: agmfast.h:106
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

double TAGMFast::LikelihoodHoldOut ( const bool  DoParallel = false)

Definition at line 647 of file agmfast.cpp.

References G, HOVIDSV, TUNGraph::IsEdge(), NegWgt, and Prediction().

Referenced by FindComsByCV().

647  {
648  double L = 0.0;
649  for (int u = 0; u < HOVIDSV.Len(); u++) {
650  for (int e = 0; e < HOVIDSV[u].Len(); e++) {
651  int VID = HOVIDSV[u][e];
652  if (VID == u) { continue; }
653  double Pred = Prediction(u, VID);
654  if (G->IsEdge(u, VID)) {
655  L += log(1.0 - Pred);
656  }
657  else {
658  L += NegWgt * log(Pred);
659  }
660  }
661  }
662  return L;
663 }
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool IsEdge(const int &SrcNId, const int &DstNId) const
Tests whether an edge between node IDs SrcNId and DstNId exists in the graph.
Definition: graph.cpp:108

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::Load ( TSIn SIn,
const int &  RndSeed = 0 
)

Definition at line 22 of file agmfast.cpp.

References F, G, HOVIDSV, TUNGraph::Load(), TVec< TVal, TSizeTy >::Load(), TBool::Load(), TInt::Load(), TFlt::Load(), MaxVal, MinVal, NegWgt, NIDV, NodesOk, NumComs, PNoCom, TRnd::PutSeed(), RegCoef, Rnd, and SumFV.

22  {
23  G->Load(SIn);
24  F.Load(SIn);
25  NIDV.Load(SIn);
26  RegCoef.Load(SIn);
27  SumFV.Load(SIn);
28  NodesOk.Load(SIn);
29  MinVal.Load(SIn);
30  MaxVal.Load(SIn);
31  NegWgt.Load(SIn);
32  NumComs.Load(SIn);
33  HOVIDSV.Load(SIn);
34  PNoCom.Load(SIn);
35  Rnd.PutSeed(RndSeed);
36 }
PUNGraph G
Definition: agmfast.h:9
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
void Load(TSIn &SIn)
Definition: dt.h:901
void Load(TSIn &SIn)
Definition: ds.h:877
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
TVec< TIntFltH > F
Definition: agmfast.h:10
void Load(TSIn &SIn)
Definition: dt.h:1056
TInt NumComs
Definition: agmfast.h:16
static PUNGraph Load(TSIn &SIn)
Static constructor that loads the graph from a stream SIn and returns a pointer to it...
Definition: graph.h:155
TFltV SumFV
Definition: agmfast.h:14
void PutSeed(const int &_Seed)
Definition: dt.cpp:18
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
TIntV NIDV
Definition: agmfast.h:12
void Load(TSIn &SIn)
Definition: dt.h:1308
TRnd Rnd
Definition: agmfast.h:11
TFlt MaxVal
Definition: agmfast.h:20

Here is the call graph for this function:

int TAGMFast::MLEGradAscent ( const double &  Thres,
const int &  MaxIter,
const TStr PlotNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 690 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), AddCom(), THash< TKey, TDat, THashFunc >::BegI(), DelCom(), TStr::Empty(), THash< TKey, TDat, THashFunc >::EndI(), F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), GetStepSizeByLineSearch(), TExeTm::GetTmStr(), GradientForRow(), HOVIDSV, IAssert, Likelihood(), TFlt::Mn, Norm2(), TGnuPlot::PlotValV(), and Rnd.

Referenced by FindComsByCV().

690  {
691  time_t InitTime = time(NULL);
692  TExeTm ExeTm, CheckTm;
693  int iter = 0, PrevIter = 0;
694  TIntFltPrV IterLV;
695  TUNGraph::TNodeI UI;
696  double PrevL = TFlt::Mn, CurL = 0.0;
697  TIntV NIdxV(F.Len(), 0);
698  for (int i = 0; i < F.Len(); i++) { NIdxV.Add(i); }
699  IAssert(NIdxV.Len() == F.Len());
700  TIntFltH GradV;
701  while(iter < MaxIter) {
702  NIdxV.Shuffle(Rnd);
703  for (int ui = 0; ui < F.Len(); ui++, iter++) {
704  int u = NIdxV[ui]; //
705  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
706  UI = G->GetNI(u);
707  TIntSet CIDSet(5 * UI.GetDeg());
708  for (int e = 0; e < UI.GetDeg(); e++) {
709  if (HOVIDSV[u].IsKey(UI.GetNbrNId(e))) { continue; }
710  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
711  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
712  CIDSet.AddKey(CI.GetKey());
713  }
714  }
715  for (TIntFltH::TIter CI = F[u].BegI(); CI < F[u].EndI(); CI++) { //remove the community membership which U does not share with its neighbors
716  if (! CIDSet.IsKey(CI.GetKey())) {
717  DelCom(u, CI.GetKey());
718  }
719  }
720  if (CIDSet.Empty()) { continue; }
721  GradientForRow(u, GradV, CIDSet);
722  if (Norm2(GradV) < 1e-4) { continue; }
723  double LearnRate = GetStepSizeByLineSearch(u, GradV, GradV, StepAlpha, StepBeta);
724  if (LearnRate == 0.0) { continue; }
725  for (int ci = 0; ci < GradV.Len(); ci++) {
726  int CID = GradV.GetKey(ci);
727  double Change = LearnRate * GradV.GetDat(CID);
728  double NewFuc = GetCom(u, CID) + Change;
729  if (NewFuc <= 0.0) {
730  DelCom(u, CID);
731  } else {
732  AddCom(u, CID, NewFuc);
733  }
734  }
735  if (! PlotNm.Empty() && (iter + 1) % G->GetNodes() == 0) {
736  IterLV.Add(TIntFltPr(iter, Likelihood(false)));
737  }
738  }
739  printf("\r%d iterations (%f) [%lu sec]", iter, CurL, time(NULL) - InitTime);
740  fflush(stdout);
741  if (iter - PrevIter >= 2 * G->GetNodes() && iter > 10000) {
742  PrevIter = iter;
743  CurL = Likelihood();
744  if (PrevL > TFlt::Mn && ! PlotNm.Empty()) {
745  printf("\r%d iterations, Likelihood: %f, Diff: %f", iter, CurL, CurL - PrevL);
746  }
747  fflush(stdout);
748  if (CurL - PrevL <= Thres * fabs(PrevL)) { break; }
749  else { PrevL = CurL; }
750  }
751 
752  }
753  printf("\n");
754  printf("MLE for Lambda completed with %d iterations(%s)\n", iter, ExeTm.GetTmStr());
755  if (! PlotNm.Empty()) {
756  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
757  }
758  return iter;
759 }
#define IAssert(Cond)
Definition: bd.h:262
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
Definition: tm.h:354
void DelCom(const int &NID, const int &CID)
Definition: agmfast.h:72
TIter BegI() const
Definition: hash.h:171
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
TIter EndI() const
Definition: hash.h:176
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
const char * GetTmStr() const
Definition: tm.h:369
TVec< TIntFltH > F
Definition: agmfast.h:10
double GetStepSizeByLineSearch(const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
Definition: agmfast.cpp:665
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:488
static void PlotValV(const TVec< TPair< TVal1, TVal2 > > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:363
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559
void GradientForRow(const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
Definition: agmfast.cpp:250
static const double Mn
Definition: dt.h:1293
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::MLEGradAscentParallel ( const double &  Thres,
const int &  MaxIter,
const int  ChunkNum,
const int  ChunkSize,
const TStr PlotNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 761 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::BegI(), TStr::Empty(), THashKeyDatI< TKey, TDat >::EndI, THash< TKey, TDat, THashFunc >::EndI(), F, G, TSecTm::GetAbsSecs(), TSecTm::GetCurTm(), THash< TKey, TDat, THashFunc >::GetDat(), TVec< TVal, TSizeTy >::GetDat(), TUNGraph::TNodeI::GetDeg(), THash< TKey, TDat, THashFunc >::GetKey(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), GetStepSizeByLineSearch(), TExeTm::GetTmStr(), GradientForRow(), HOVIDSV, IAssert, THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), Likelihood(), TFlt::Mn, Norm2(), TGnuPlot::PlotValV(), TVec< TVal, TSizeTy >::PutAll(), Rnd, and SumFV.

Referenced by FindComsByCV(), and MLEGradAscentParallel().

761  {
762  //parallel
763  time_t InitTime = time(NULL);
764  uint64 StartTm = TSecTm::GetCurTm().GetAbsSecs();
765  TExeTm ExeTm, CheckTm;
766  double PrevL = Likelihood(true);
767  TIntFltPrV IterLV;
768  int PrevIter = 0;
769  int iter = 0;
770  TIntV NIdxV(F.Len(), 0);
771  for (int i = 0; i < F.Len(); i++) { NIdxV.Add(i); }
772  TIntV NIDOPTV(F.Len()); //check if a node needs optimization or not 1: does not require optimization
773  NIDOPTV.PutAll(0);
774  TVec<TIntFltH> NewF(ChunkNum * ChunkSize);
775  TIntV NewNIDV(ChunkNum * ChunkSize);
776  for (iter = 0; iter < MaxIter; iter++) {
777  NIdxV.Clr(false);
778  for (int i = 0; i < F.Len(); i++) {
779  if (NIDOPTV[i] == 0) { NIdxV.Add(i); }
780  }
781  IAssert (NIdxV.Len() <= F.Len());
782  NIdxV.Shuffle(Rnd);
783  // compute gradient for chunk of nodes
784 #pragma omp parallel for schedule(static, 1)
785  for (int TIdx = 0; TIdx < ChunkNum; TIdx++) {
786  TIntFltH GradV;
787  for (int ui = TIdx * ChunkSize; ui < (TIdx + 1) * ChunkSize; ui++) {
788  NewNIDV[ui] = -1;
789  if (ui > NIdxV.Len()) { continue; }
790  int u = NIdxV[ui]; //
791  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
792  TUNGraph::TNodeI UI = G->GetNI(u);
793  TIntSet CIDSet(5 * UI.GetDeg());
794  TIntFltH CurFU = F[u];
795  for (int e = 0; e < UI.GetDeg(); e++) {
796  if (HOVIDSV[u].IsKey(UI.GetNbrNId(e))) { continue; }
797  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
798  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
799  CIDSet.AddKey(CI.GetKey());
800  }
801  }
802  if (CIDSet.Empty()) {
803  CurFU.Clr();
804  }
805  else {
806  for (TIntFltH::TIter CI = CurFU.BegI(); CI < CurFU.EndI(); CI++) { //remove the community membership which U does not share with its neighbors
807  if (! CIDSet.IsKey(CI.GetKey())) {
808  CurFU.DelIfKey(CI.GetKey());
809  }
810  }
811  GradientForRow(u, GradV, CIDSet);
812  if (Norm2(GradV) < 1e-4) { NIDOPTV[u] = 1; continue; }
813  double LearnRate = GetStepSizeByLineSearch(u, GradV, GradV, StepAlpha, StepBeta, 5);
814  if (LearnRate == 0.0) { NewNIDV[ui] = -2; continue; }
815  for (int ci = 0; ci < GradV.Len(); ci++) {
816  int CID = GradV.GetKey(ci);
817  double Change = LearnRate * GradV.GetDat(CID);
818  double NewFuc = CurFU.IsKey(CID)? CurFU.GetDat(CID) + Change : Change;
819  if (NewFuc <= 0.0) {
820  CurFU.DelIfKey(CID);
821  } else {
822  CurFU.AddDat(CID) = NewFuc;
823  }
824  }
825  CurFU.Defrag();
826  }
827  //store changes
828  NewF[ui] = CurFU;
829  NewNIDV[ui] = u;
830  }
831  }
832  int NumNoChangeGrad = 0;
833  int NumNoChangeStepSize = 0;
834  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
835  int NewNID = NewNIDV[ui];
836  if (NewNID == -1) { NumNoChangeGrad++; continue; }
837  if (NewNID == -2) { NumNoChangeStepSize++; continue; }
838  for (TIntFltH::TIter CI = F[NewNID].BegI(); CI < F[NewNID].EndI(); CI++) {
839  SumFV[CI.GetKey()] -= CI.GetDat();
840  }
841  }
842 #pragma omp parallel for
843  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
844  int NewNID = NewNIDV[ui];
845  if (NewNID < 0) { continue; }
846  F[NewNID] = NewF[ui];
847  }
848  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
849  int NewNID = NewNIDV[ui];
850  if (NewNID < 0) { continue; }
851  for (TIntFltH::TIter CI = F[NewNID].BegI(); CI < F[NewNID].EndI(); CI++) {
852  SumFV[CI.GetKey()] += CI.GetDat();
853  }
854  }
855  // update the nodes who are optimal
856  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
857  int NewNID = NewNIDV[ui];
858  if (NewNID < 0) { continue; }
859  TUNGraph::TNodeI UI = G->GetNI(NewNID);
860  NIDOPTV[NewNID] = 0;
861  for (int e = 0; e < UI.GetDeg(); e++) {
862  NIDOPTV[UI.GetNbrNId(e)] = 0;
863  }
864  }
865  int OPTCnt = 0;
866  for (int i = 0; i < NIDOPTV.Len(); i++) { if (NIDOPTV[i] == 1) { OPTCnt++; } }
867  if (! PlotNm.Empty()) {
868  printf("\r%d iterations [%s] %d secs", iter * ChunkSize * ChunkNum, ExeTm.GetTmStr(), int(TSecTm::GetCurTm().GetAbsSecs() - StartTm));
869  if (PrevL > TFlt::Mn) { printf(" (%f) %d g %d s %d OPT", PrevL, NumNoChangeGrad, NumNoChangeStepSize, OPTCnt); }
870  fflush(stdout);
871  }
872  if ((iter - PrevIter) * ChunkSize * ChunkNum >= G->GetNodes()) {
873  PrevIter = iter;
874  double CurL = Likelihood(true);
875  IterLV.Add(TIntFltPr(iter * ChunkSize * ChunkNum, CurL));
876  printf("\r%d iterations, Likelihood: %f, Diff: %f [%d secs]", iter, CurL, CurL - PrevL, int(time(NULL) - InitTime));
877  fflush(stdout);
878  if (CurL - PrevL <= Thres * fabs(PrevL)) {
879  break;
880  }
881  else {
882  PrevL = CurL;
883  }
884  }
885  }
886  if (! PlotNm.Empty()) {
887  printf("\nMLE completed with %d iterations(%d secs)\n", iter, int(TSecTm::GetCurTm().GetAbsSecs() - StartTm));
888  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
889  } else {
890  printf("\rMLE completed with %d iterations(%d secs)", iter, int(time(NULL) - InitTime));
891  fflush(stdout);
892  }
893  return iter;
894 }
#define IAssert(Cond)
Definition: bd.h:262
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
Definition: tm.h:354
TIter BegI() const
Definition: hash.h:171
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:220
TIter EndI() const
Definition: hash.h:176
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
const char * GetTmStr() const
Definition: tm.h:369
TVec< TIntFltH > F
Definition: agmfast.h:10
double GetStepSizeByLineSearch(const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
Definition: agmfast.cpp:665
void PutAll(const TVal &Val)
Sets all elements of the vector to value Val.
Definition: ds.h:1130
unsigned long long uint64
Definition: bd.h:38
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:792
uint GetAbsSecs() const
Definition: tm.h:149
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:488
static void PlotValV(const TVec< TPair< TVal1, TVal2 > > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:363
TRnd Rnd
Definition: agmfast.h:11
static TSecTm GetCurTm()
Definition: tm.cpp:692
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559
int Len() const
Definition: hash.h:186
void GradientForRow(const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
Definition: agmfast.cpp:250
THKeyDat * EndI
Definition: hash.h:45
const TKey & GetKey(const int &KeyId) const
Definition: hash.h:210
static const double Mn
Definition: dt.h:1293
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::MLEGradAscentParallel ( const double &  Thres,
const int &  MaxIter,
const int  ChunkNum,
const TStr PlotNm = TStr(),
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)
inline

Definition at line 49 of file agmfast.h.

References G, TUNGraph::GetNodes(), and MLEGradAscentParallel().

49  {
50  int ChunkSize = G->GetNodes() / 10 / ChunkNum;
51  if (ChunkSize == 0) { ChunkSize = 1; }
52  return MLEGradAscentParallel(Thres, MaxIter, ChunkNum, ChunkSize, PlotNm, StepAlpha, StepBeta);
53  }
PUNGraph G
Definition: agmfast.h:9
int MLEGradAscentParallel(const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:761
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162

Here is the call graph for this function:

int TAGMFast::MLENewton ( const double &  Thres,
const int &  MaxIter,
const TStr PlotNm = TStr() 
)

Newton method: DEPRECATED.

Definition at line 416 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), AddCom(), THashSet< TKey, THashFunc >::AddKey(), THash< TKey, TDat, THashFunc >::BegI(), THashSet< TKey, THashFunc >::BegI(), DelCom(), DotProduct(), TStr::Empty(), THashSet< TKey, THashFunc >::Empty(), THash< TKey, TDat, THashFunc >::EndI(), THashSet< TKey, THashFunc >::EndI(), F, TStr::Fmt(), G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TExeTm::GetTmStr(), GradientForOneVar(), HessianForOneVar(), HOVIDSV, IAssertR, THashSet< TKey, THashFunc >::IsKey(), Likelihood(), TFlt::Mn, TGnuPlot::PlotValV(), PNoCom, Rnd, TVec< TVal, TSizeTy >::Shuffle(), and TFlt::Val.

416  {
417  TExeTm ExeTm;
418  int iter = 0, PrevIter = 0;
419  TIntFltPrV IterLV;
420  double PrevL = TFlt::Mn, CurL;
421  TUNGraph::TNodeI UI;
422  TIntV NIdxV;
423  G->GetNIdV(NIdxV);
424  int CID, UID, NewtonIter;
425  double Fuc;
426  //double PrevFuc;
427  double Grad, H;
428  while(iter < MaxIter) {
429  NIdxV.Shuffle(Rnd);
430  for (int ui = 0; ui < F.Len(); ui++, iter++) {
431  if (! PlotNm.Empty() && iter % G->GetNodes() == 0) {
432  IterLV.Add(TIntFltPr(iter, Likelihood(false)));
433  }
434  UID = NIdxV[ui];
435  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
436  TIntSet CIDSet;
437  UI = G->GetNI(UID);
438  if (UI.GetDeg() == 0) { //if the node is isolated, clear its membership and skip
439  if (! F[UID].Empty()) { F[UID].Clr(); }
440  continue;
441  }
442  for (int e = 0; e < UI.GetDeg(); e++) {
443  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
444  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
445  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
446  CIDSet.AddKey(CI.GetKey());
447  }
448  }
449  for (TIntFltH::TIter CI = F[UID].BegI(); CI < F[UID].EndI(); CI++) { //remove the community membership which U does not share with its neighbors
450  if (! CIDSet.IsKey(CI.GetKey())) {
451  DelCom(UID, CI.GetKey());
452  }
453  }
454  if (CIDSet.Empty()) { continue; }
455  for (TIntSet::TIter CI = CIDSet.BegI(); CI < CIDSet.EndI(); CI++) {
456  CID = CI.GetKey();
457  //optimize for UID, CID
458  //compute constants
459  TFltV AlphaKV(UI.GetDeg());
460  for (int e = 0; e < UI.GetDeg(); e++) {
461  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
462  AlphaKV[e] = (1 - PNoCom) * exp(- DotProduct(UID, UI.GetNbrNId(e)) + GetCom(UI.GetNbrNId(e), CID) * GetCom(UID, CID));
463  IAssertR(AlphaKV[e] <= 1.0, TStr::Fmt("AlphaKV=%f, %f, %f", AlphaKV[e].Val, PNoCom.Val, GetCom(UI.GetNbrNId(e), CID)));
464  }
465  Fuc = GetCom(UID, CID);
466  //PrevFuc = Fuc;
467  Grad = GradientForOneVar(AlphaKV, UID, CID, Fuc), H = 0.0;
468  if (Grad <= 1e-3 && Grad >= -0.1) { continue; }
469  NewtonIter = 0;
470  while (NewtonIter++ < 10) {
471  Grad = GradientForOneVar(AlphaKV, UID, CID, Fuc), H = 0.0;
472  H = HessianForOneVar(AlphaKV, UID, CID, Fuc);
473  if (Fuc == 0.0 && Grad <= 0.0) { Grad = 0.0; }
474  if (fabs(Grad) < 1e-3) { break; }
475  if (H == 0.0) { Fuc = 0.0; break; }
476  double NewtonStep = - Grad / H;
477  if (NewtonStep < -0.5) { NewtonStep = - 0.5; }
478  Fuc += NewtonStep;
479  if (Fuc < 0.0) { Fuc = 0.0; }
480  }
481  if (Fuc == 0.0) {
482  DelCom(UID, CID);
483  }
484  else {
485  AddCom(UID, CID, Fuc);
486  }
487  }
488  }
489  if (iter - PrevIter >= 2 * G->GetNodes() && iter > 10000) {
490  PrevIter = iter;
491  CurL = Likelihood();
492  if (PrevL > TFlt::Mn && ! PlotNm.Empty()) {
493  printf("\r%d iterations, Likelihood: %f, Diff: %f", iter, CurL, CurL - PrevL);
494  }
495  fflush(stdout);
496  if (CurL - PrevL <= Thres * fabs(PrevL)) { break; }
497  else { PrevL = CurL; }
498  }
499 
500  }
501  if (! PlotNm.Empty()) {
502  printf("\nMLE for Lambda completed with %d iterations(%s)\n", iter, ExeTm.GetTmStr());
503  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
504  }
505  return iter;
506 }
PUNGraph G
Definition: agmfast.h:9
#define IAssertR(Cond, Reason)
Definition: bd.h:265
Definition: tm.h:354
double GradientForOneVar(const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
Definition: agmfast.cpp:339
TIter BegI() const
Definition: shash.h:1099
void GetNIdV(TIntV &NIdV) const
Gets a vector IDs of all nodes in the graph.
Definition: graph.cpp:125
void DelCom(const int &NID, const int &CID)
Definition: agmfast.h:72
TIter BegI() const
Definition: hash.h:171
double Val
Definition: dt.h:1291
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
TIter EndI() const
Definition: hash.h:176
bool IsKey(const TKey &Key) const
Definition: shash.h:1142
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TFlt PNoCom
Definition: agmfast.h:22
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
const char * GetTmStr() const
Definition: tm.h:369
bool Empty() const
Definition: shash.h:1114
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
int AddKey(const TKey &Key)
Definition: shash.h:1248
TIter EndI() const
Definition: shash.h:1106
double HessianForOneVar(const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
Definition: agmfast.cpp:394
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:488
static void PlotValV(const TVec< TPair< TVal1, TVal2 > > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:363
static TStr Fmt(const char *FmtStr,...)
Definition: dt.cpp:1599
void Shuffle(TRnd &Rnd)
Randomly shuffles the elements of the vector.
Definition: ds.h:1235
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:559
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78
static const double Mn
Definition: dt.h:1293
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

void TAGMFast::NeighborComInit ( const int  InitComs)

Definition at line 60 of file agmfast.cpp.

References AddCom(), F, G, TVec< TVal, TSizeTy >::Gen(), TAGMUtil::GetConductance(), TUNGraph::TNodeI::GetDeg(), TUNGraph::GetEdges(), TAGMUtil::GetNbhCom(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), TExeTm::GetTmStr(), TRnd::GetUniDev(), TRnd::GetUniDevInt(), IAssert, TVec< TVal, TSizeTy >::Len(), NumComs, Rnd, and SumFV.

Referenced by FindComsByCV().

60  {
61  //initialize with best neighborhood communities (Gleich et.al. KDD'12)
62  F.Gen(G->GetNodes());
63  SumFV.Gen(InitComs);
64  NumComs = InitComs;
65  const int Edges = G->GetEdges();
66  //TIntFltH NCPhiH(F.Len());
67  TFltIntPrV NIdPhiV(F.Len(), 0);
68  TIntSet InvalidNIDS(F.Len());
69  TIntV ChosenNIDV(InitComs, 0); //FOR DEBUG
70  TExeTm RunTm;
71  //compute conductance of neighborhood community
72  for (int u = 0; u < F.Len(); u++) {
73  TIntSet NBCmty(G->GetNI(u).GetDeg() + 1);
74  double Phi;
75  if (G->GetNI(u).GetDeg() < 5) { //do not include nodes with too few degree
76  Phi = 1.0;
77  } else {
78  TAGMUtil::GetNbhCom(G, u, NBCmty);
79  IAssert(NBCmty.Len() == G->GetNI(u).GetDeg() + 1);
80  Phi = TAGMUtil::GetConductance(G, NBCmty, Edges);
81  }
82  //NCPhiH.AddDat(u, Phi);
83  NIdPhiV.Add(TFltIntPr(Phi, u));
84  }
85  NIdPhiV.Sort(true);
86  printf("conductance computation completed [%s]\n", RunTm.GetTmStr());
87  fflush(stdout);
88  //choose nodes with local minimum in conductance
89  int CurCID = 0;
90  for (int ui = 0; ui < NIdPhiV.Len(); ui++) {
91  int UID = NIdPhiV[ui].Val2;
92  fflush(stdout);
93  if (InvalidNIDS.IsKey(UID)) { continue; }
94  ChosenNIDV.Add(UID); //FOR DEBUG
95  //add the node and its neighbors to the current community
96  AddCom(UID, CurCID, 1.0);
97  TUNGraph::TNodeI NI = G->GetNI(UID);
98  fflush(stdout);
99  for (int e = 0; e < NI.GetDeg(); e++) {
100  AddCom(NI.GetNbrNId(e), CurCID, 1.0);
101  }
102  //exclude its neighbors from the next considerations
103  for (int e = 0; e < NI.GetDeg(); e++) {
104  InvalidNIDS.AddKey(NI.GetNbrNId(e));
105  }
106  CurCID++;
107  fflush(stdout);
108  if (CurCID >= NumComs) { break; }
109  }
110  if (NumComs > CurCID) {
111  printf("%d communities needed to fill randomly\n", NumComs - CurCID);
112  }
113  //assign a member to zero-member community (if any)
114  for (int c = 0; c < SumFV.Len(); c++) {
115  if (SumFV[c] == 0.0) {
116  int ComSz = 10;
117  for (int u = 0; u < ComSz; u++) {
118  int UID = Rnd.GetUniDevInt(G->GetNodes());
119  AddCom(UID, c, Rnd.GetUniDev());
120  }
121  }
122  }
123 }
#define IAssert(Cond)
Definition: bd.h:262
static void GetNbhCom(const PUNGraph &Graph, const int NID, TIntSet &NBCmtyS)
Definition: agm.cpp:706
PUNGraph G
Definition: agmfast.h:9
TPair< TFlt, TInt > TFltIntPr
Definition: ds.h:97
Definition: tm.h:354
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:74
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:63
static double GetConductance(const PUNGraph &Graph, const TIntSet &CmtyS, const int Edges)
Definition: agm.cpp:683
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
const char * GetTmStr() const
Definition: tm.h:369
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TRnd Rnd
Definition: agmfast.h:11
double GetUniDev()
Definition: dt.h:30
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:101
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:420

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Norm2 ( const TIntFltH UV)
inline

Definition at line 113 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), and THash< TKey, TDat, THashFunc >::EndI().

Referenced by LikelihoodForRow(), MLEGradAscent(), and MLEGradAscentParallel().

113  {
114  double N = 0.0;
115  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
116  N += HI.GetDat() * HI.GetDat();
117  }
118  return N;
119  }
TIter BegI() const
Definition: hash.h:171
TIter EndI() const
Definition: hash.h:176

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Prediction ( const TIntFltH FU,
const TIntFltH FV 
)
inline

Definition at line 98 of file agmfast.h.

References DotProduct(), TStr::Fmt(), IAssertR, and PNoCom.

Referenced by GradientForRow(), LikelihoodForRow(), LikelihoodHoldOut(), and Prediction().

98  {
99  double DP = log (1.0 / (1.0 - PNoCom)) + DotProduct(FU, FV);
100  IAssertR(DP > 0.0, TStr::Fmt("DP: %f", DP));
101  return exp(- DP);
102  }
#define IAssertR(Cond, Reason)
Definition: bd.h:265
TFlt PNoCom
Definition: agmfast.h:22
static TStr Fmt(const char *FmtStr,...)
Definition: dt.cpp:1599
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Prediction ( const int &  UID,
const int &  VID 
)
inline

Definition at line 103 of file agmfast.h.

References F, and Prediction().

103  {
104  return Prediction(F[UID], F[VID]);
105  }
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TVec< TIntFltH > F
Definition: agmfast.h:10

Here is the call graph for this function:

void TAGMFast::RandomInit ( const int  InitComs)

Definition at line 38 of file agmfast.cpp.

References AddCom(), F, G, TVec< TVal, TSizeTy >::Gen(), TUNGraph::TNodeI::GetDeg(), TUNGraph::GetNI(), TUNGraph::GetNodes(), TRnd::GetUniDev(), TRnd::GetUniDevInt(), TVec< TVal, TSizeTy >::Len(), NumComs, Rnd, and SumFV.

Referenced by FindComsByCV(), and TAGMFast().

38  {
39  F.Gen(G->GetNodes());
40  SumFV.Gen(InitComs);
41  NumComs = InitComs;
42  for (int u = 0; u < F.Len(); u++) {
43  //assign to just one community
44  int Mem = G->GetNI(u).GetDeg();
45  if (Mem > 10) { Mem = 10; }
46  for (int c = 0; c < Mem; c++) {
47  int CID = Rnd.GetUniDevInt(InitComs);
48  AddCom(u, CID, Rnd.GetUniDev());
49  }
50  }
51  //assign a member to zero-member community (if any)
52  for (int c = 0; c < SumFV.Len(); c++) {
53  if (SumFV[c] == 0.0) {
54  int UID = Rnd.GetUniDevInt(G->GetNodes());
55  AddCom(UID, c, Rnd.GetUniDev());
56  }
57  }
58 }
PUNGraph G
Definition: agmfast.h:9
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:82
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:205
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TRnd Rnd
Definition: agmfast.h:11
double GetUniDev()
Definition: dt.h:30
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::Save ( TSOut SOut)

Definition at line 7 of file agmfast.cpp.

References F, G, HOVIDSV, MaxVal, MinVal, NegWgt, NIDV, NodesOk, NumComs, PNoCom, RegCoef, TUNGraph::Save(), TVec< TVal, TSizeTy >::Save(), TBool::Save(), TInt::Save(), TFlt::Save(), and SumFV.

7  {
8  G->Save(SOut);
9  F.Save(SOut);
10  NIDV.Save(SOut);
11  RegCoef.Save(SOut);
12  SumFV.Save(SOut);
13  NodesOk.Save(SOut);
14  MinVal.Save(SOut);
15  MaxVal.Save(SOut);
16  NegWgt.Save(SOut);
17  NumComs.Save(SOut);
18  HOVIDSV.Save(SOut);
19  PNoCom.Save(SOut);
20 }
PUNGraph G
Definition: agmfast.h:9
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
void Save(TSOut &SOut) const
Definition: dt.h:1057
void Save(TSOut &SOut) const
Definition: dt.h:902
void Save(TSOut &SOut) const
Saves the graph to a (binary) stream SOut.
Definition: graph.h:147
void Save(TSOut &SOut) const
Definition: ds.h:885
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
TVec< TIntFltH > F
Definition: agmfast.h:10
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
TIntV NIDV
Definition: agmfast.h:12
TFlt MaxVal
Definition: agmfast.h:20
void Save(TSOut &SOut) const
Definition: dt.h:1305

Here is the call graph for this function:

void TAGMFast::SetCmtyVV ( const TVec< TIntV > &  CmtyVV)

Definition at line 125 of file agmfast.cpp.

References AddCom(), THash< TKey, TDat, THashFunc >::AddDat(), F, G, TVec< TVal, TSizeTy >::Gen(), TVec< TVal, TSizeTy >::GetDat(), TUNGraph::GetNodes(), TUNGraph::IsNode(), TVec< TVal, TSizeTy >::Len(), NIDV, NodesOk, NumComs, and SumFV.

125  {
126  F.Gen(G->GetNodes());
127  SumFV.Gen(CmtyVV.Len());
128  NumComs = CmtyVV.Len();
129  TIntH NIDIdxH(NIDV.Len());
130  if (! NodesOk) {
131  for (int u = 0; u < NIDV.Len(); u++) {
132  NIDIdxH.AddDat(NIDV[u], u);
133  }
134  }
135  for (int c = 0; c < CmtyVV.Len(); c++) {
136  for (int u = 0; u < CmtyVV[c].Len(); u++) {
137  int UID = CmtyVV[c][u];
138  if (! NodesOk) { UID = NIDIdxH.GetDat(UID); }
139  if (G->IsNode(UID)) {
140  AddCom(UID, c, 1.0);
141  }
142  }
143  }
144 }
PUNGraph G
Definition: agmfast.h:9
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:535
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
TBool NodesOk
Definition: agmfast.h:15
TVec< TIntFltH > F
Definition: agmfast.h:10
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:792
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TIntV NIDV
Definition: agmfast.h:12
bool IsNode(const int &NId) const
Tests whether ID NId is a node.
Definition: graph.h:199
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:486
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TDat & AddDat(const TKey &Key)
Definition: hash.h:196

Here is the call graph for this function:

void TAGMFast::SetGraph ( const PUNGraph GraphPt)

Definition at line 146 of file agmfast.cpp.

References TSnap::DelSelfEdges(), DoParallel, G, TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TSnap::GetSubGraph(), HOVIDSV, IAssert, TUNGraph::IsNode(), TFlt::Mx, NegWgt, NIDV, NodesOk, and PNoCom.

Referenced by TAGMFast().

146  {
147  G = GraphPt;
148  HOVIDSV.Gen(G->GetNodes());
149  NodesOk = true;
150  GraphPt->GetNIdV(NIDV);
151  // check that nodes IDs are {0,1,..,Nodes-1}
152  for (int nid = 0; nid < GraphPt->GetNodes(); nid++) {
153  if (! GraphPt->IsNode(nid)) {
154  NodesOk = false;
155  break;
156  }
157  }
158  if (! NodesOk) {
159  printf("rearrage nodes\n");
160  G = TSnap::GetSubGraph(GraphPt, NIDV, true);
161  for (int nid = 0; nid < G->GetNodes(); nid++) {
162  IAssert(G->IsNode(nid));
163  }
164  }
166 
167  PNoCom = 1.0 / (double) G->GetNodes();
168  DoParallel = false;
169  if (1.0 / PNoCom > sqrt(TFlt::Mx)) { PNoCom = 0.99 / sqrt(TFlt::Mx); } // to prevent overflow
170  NegWgt = 1.0;
171 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
void GetNIdV(TIntV &NIdV) const
Gets a vector IDs of all nodes in the graph.
Definition: graph.cpp:125
static const double Mx
Definition: dt.h:1294
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:162
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
PUNGraph GetSubGraph(const PUNGraph &Graph, const TIntV &NIdV, const bool &RenumberNodes)
Returns an induced subgraph of an undirected graph Graph with NIdV nodes with an optional node renumb...
Definition: subgraph.cpp:7
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TIntV NIDV
Definition: agmfast.h:12
bool IsNode(const int &NId) const
Tests whether ID NId is a node.
Definition: graph.h:199
void DelSelfEdges(const PGraph &Graph)
Removes all the self-edges from the graph.
Definition: alg.h:419

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::SetRegCoef ( const double  _RegCoef)
inline

Definition at line 28 of file agmfast.h.

References RegCoef.

28 { RegCoef = _RegCoef; }
TFlt RegCoef
Definition: agmfast.h:13
double TAGMFast::Sum ( const TIntFltH UV)
inline

Definition at line 106 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), and THash< TKey, TDat, THashFunc >::EndI().

Referenced by LikelihoodForRow().

106  {
107  double N = 0.0;
108  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
109  N += HI.GetDat();
110  }
111  return N;
112  }
TIter BegI() const
Definition: hash.h:171
TIter EndI() const
Definition: hash.h:176

Here is the call graph for this function:

Here is the caller graph for this function:

Member Data Documentation

TBool TAGMFast::DoParallel

Definition at line 23 of file agmfast.h.

Referenced by GradientForRow(), LikelihoodForRow(), and SetGraph().

TFlt TAGMFast::MaxVal

Definition at line 20 of file agmfast.h.

Referenced by GetStepSizeByLineSearch(), Load(), and Save().

TFlt TAGMFast::MinVal

Definition at line 19 of file agmfast.h.

Referenced by GetStepSizeByLineSearch(), Load(), and Save().

TIntV TAGMFast::NIDV
private

Definition at line 12 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), Save(), SetCmtyVV(), and SetGraph().

TBool TAGMFast::NodesOk
private

Definition at line 15 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), Save(), SetCmtyVV(), and SetGraph().

TInt TAGMFast::NumComs
private

Definition at line 16 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), NeighborComInit(), RandomInit(), Save(), and SetCmtyVV().

TFlt TAGMFast::PNoCom

Definition at line 22 of file agmfast.h.

Referenced by Load(), MLENewton(), Prediction(), Save(), and SetGraph().

TRnd TAGMFast::Rnd
private

The documentation for this class was generated from the following files: