SNAP Library 4.0, User Reference
2017-07-27 13:18:06
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
Table class: Relational table with columnar data storage. More...
#include <table.h>
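As a rough illustration of what "columnar data storage" means, the sketch below keeps each column in its own contiguous vector, so reading a whole column touches a single array. The `ColumnarTable` type and its members are hypothetical stand-ins written against the C++ standard library only; they are not SNAP's TTable implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a columnar table; not SNAP's TTable.
struct ColumnarTable {
  std::vector<std::string> IntColNames;
  std::vector<std::vector<int>> IntCols;  // IntCols[c][r] = value of column c at row r

  std::size_t NumRows() const { return IntCols.empty() ? 0 : IntCols[0].size(); }

  // Adds an integer column (existing rows are padded with 0) and returns its index.
  std::size_t AddIntCol(const std::string& Name) {
    std::vector<int> Col(NumRows(), 0);
    IntColNames.push_back(Name);
    IntCols.push_back(Col);
    return IntCols.size() - 1;
  }

  // Appends one row; Vals must supply one value per integer column.
  void AddRow(const std::vector<int>& Vals) {
    assert(Vals.size() == IntCols.size());
    for (std::size_t c = 0; c < IntCols.size(); ++c) IntCols[c].push_back(Vals[c]);
  }

  // Reading a whole column returns one contiguous vector.
  const std::vector<int>& ReadIntCol(std::size_t ColIdx) const { return IntCols.at(ColIdx); }
};
```

With this layout, a column scan is a linear pass over one vector, while appending a row writes to every column.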
Classes | |
class | TLoadVecInit |
Public Member Functions | |
void | AddIntCol (const TStr &ColName) |
Adds an integer column with name ColName . More... | |
void | AddFltCol (const TStr &ColName) |
Adds a float column with name ColName . More... | |
void | AddStrCol (const TStr &ColName) |
Adds a string column with name ColName . More... | |
void | GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values, using OpenMP multi-threading. More... | |
TTable () | |
TTable (TTableContext *Context) | |
TTable (const Schema &S, TTableContext *Context) | |
TTable (TSIn &SIn, TTableContext *Context) | |
TTable (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->int. More... | |
TTable (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->float. More... | |
TTable (const TTable &Table) | |
Copy constructor. More... | |
TTable (const TTable &Table, const TIntV &RowIds) | |
void | SaveSS (const TStr &OutFNm) |
Saves table schema and content to a TSV file. More... | |
void | SaveBin (const TStr &OutFNm) |
Saves table schema and content to a binary file. More... | |
void | Save (TSOut &SOut) |
Saves table schema and content to a binary format. More... | |
void | Dump (FILE *OutF=stdout) const |
Prints table contents to a text file. More... | |
void | AddRow (const TTableRow &Row) |
Adds row with values taken from given TTableRow. More... | |
TTableContext * | GetContext () |
Returns the context. More... | |
TTableContext * | ChangeContext (TTableContext *Context) |
Changes the current context. Moves all object items to the new context. More... | |
TInt | GetColIdx (const TStr &ColName) const |
Gets index of column ColName among columns of the same type in the schema. More... | |
TInt | GetIntVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of integer attribute ColName at row RowIdx . More... | |
TFlt | GetFltVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of float attribute ColName at row RowIdx . More... | |
TStr | GetStrVal (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of string attribute ColName at row RowIdx . More... | |
TInt | GetStrMapById (TInt ColIdx, TInt RowIdx) const |
Gets the integer mapping of the string at column ColIdx at row RowIdx . More... | |
TInt | GetStrMapByName (const TStr &ColName, TInt RowIdx) const |
Gets the integer mapping of the string at column ColName at row RowIdx . More... | |
TStr | GetStrValById (TInt ColIdx, TInt RowIdx) const |
Gets the value of the string attribute at column ColIdx at row RowIdx . More... | |
TStr | GetStrValByName (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of the string attribute at column ColName at row RowIdx . More... | |
TIntV | GetIntRowIdxByVal (const TStr &ColName, const TInt &Val) const |
Gets the rows containing Val in int column ColName . More... | |
TIntV | GetStrRowIdxByMap (const TStr &ColName, const TInt &Map) const |
Gets the rows containing int mapping Map in str column ColName . More... | |
TIntV | GetFltRowIdxByVal (const TStr &ColName, const TFlt &Val) const |
Gets the rows containing Val in flt column ColName . More... | |
TInt | RequestIndexInt (const TStr &ColName) |
Creates Index for Int Column ColName . More... | |
TInt | RequestIndexFlt (const TStr &ColName) |
Creates Index for Flt Column ColName . More... | |
TInt | RequestIndexStrMap (const TStr &ColName) |
Creates Index for Str Column ColName . More... | |
TStr | GetStr (const TInt &KeyId) const |
Gets the string with KeyId . More... | |
TInt | GetIntValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the integer value at column ColIdx and row RowIdx . More... | |
TFlt | GetFltValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the float value at column ColIdx and row RowIdx . More... | |
Schema | GetSchema () |
Gets the schema of this table. More... | |
TVec< PNEANet > | ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize. More... | |
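One plausible reading of the WindowSize/JumpSize parameters is a sliding window over the values of SplitAttr. The sketch below assumes that window i covers the half-open range [StartVal + i*JumpSize, StartVal + i*JumpSize + WindowSize) and collects the row ids that fall into each window. This interpretation, and the function name `BucketRowsByWindow`, are assumptions for illustration and are not taken from SNAP's source.

```cpp
#include <cassert>
#include <vector>

// Assumed semantics (not confirmed against SNAP): window i covers
// [StartVal + i*JumpSize, StartVal + i*JumpSize + WindowSize), clipped at EndVal.
// Explicit bounds are required here; extreme defaults would overflow int.
std::vector<std::vector<int>> BucketRowsByWindow(const std::vector<int>& SplitVals,
                                                 int WindowSize, int JumpSize,
                                                 int StartVal, int EndVal) {
  assert(WindowSize > 0 && JumpSize > 0 && StartVal <= EndVal);
  std::vector<std::vector<int>> Buckets;
  for (int Lo = StartVal; Lo <= EndVal; Lo += JumpSize) {
    std::vector<int> RowIds;
    for (int Row = 0; Row < static_cast<int>(SplitVals.size()); ++Row) {
      const int V = SplitVals[Row];
      if (V >= Lo && V < Lo + WindowSize && V <= EndVal) RowIds.push_back(Row);
    }
    Buckets.push_back(RowIds);  // one bucket of row ids per graph in the sequence
  }
  return Buckets;
}
```

When JumpSize is smaller than WindowSize the windows overlap, so a row can appear in more than one bucket.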
TVec< PNEANet > | ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals. More... | |
TVec< PNEANet > | ToGraphPerGroup (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates a sequence of graphs based on grouping specified by GroupAttr. More... | |
PNEANet | ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToGraphPerGroupIterator (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates the graph sequence one at a time. More... | |
PNEANet | NextGraphIterator () |
Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions. More... | |
TBool | IsLastGraphOfSequence () |
Checks if the end of the graph sequence is reached. More... | |
TStr | GetSrcCol () const |
Gets the name of the column to be used as src nodes in the graph. More... | |
void | SetSrcCol (const TStr &Src) |
Sets the name of the column to be used as src nodes in the graph. More... | |
TStr | GetDstCol () const |
Gets the name of the column to be used as dst nodes in the graph. More... | |
void | SetDstCol (const TStr &Dst) |
Sets the name of the column to be used as dst nodes in the graph. More... | |
void | AddEdgeAttr (const TStr &Attr) |
Adds column to be used as graph edge attribute. More... | |
void | AddEdgeAttr (TStrV &Attrs) |
Adds columns to be used as graph edge attributes. More... | |
void | AddSrcNodeAttr (const TStr &Attr) |
Adds column to be used as src node attribute of the graph. More... | |
void | AddSrcNodeAttr (TStrV &Attrs) |
Adds columns to be used as src node attributes of the graph. More... | |
void | AddDstNodeAttr (const TStr &Attr) |
Adds column to be used as dst node attribute of the graph. More... | |
void | AddDstNodeAttr (TStrV &Attrs) |
Adds columns to be used as dst node attributes of the graph. More... | |
void | AddNodeAttr (const TStr &Attr) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | AddNodeAttr (TStrV &Attrs) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | SetCommonNodeAttrs (const TStr &SrcAttr, const TStr &DstAttr, const TStr &CommonAttrName) |
Sets the columns to be used as both src and dst node attributes. More... | |
TStrV | GetSrcNodeIntAttrV () const |
Gets src node int attribute name vector. More... | |
TStrV | GetDstNodeIntAttrV () const |
Gets dst node int attribute name vector. More... | |
TStrV | GetEdgeIntAttrV () const |
Gets edge int attribute name vector. More... | |
TStrV | GetSrcNodeFltAttrV () const |
Gets src node float attribute name vector. More... | |
TStrV | GetDstNodeFltAttrV () const |
Gets dst node float attribute name vector. More... | |
TStrV | GetEdgeFltAttrV () const |
Gets edge float attribute name vector. More... | |
TStrV | GetSrcNodeStrAttrV () const |
Gets src node str attribute name vector. More... | |
TStrV | GetDstNodeStrAttrV () const |
Gets dst node str attribute name vector. More... | |
TStrV | GetEdgeStrAttrV () const |
Gets edge str attribute name vector. More... | |
TAttrType | GetColType (const TStr &ColName) const |
Gets type of column ColName . More... | |
TInt | GetNumRows () const |
Gets total number of rows in this table. More... | |
TInt | GetNumValidRows () const |
Gets number of valid, i.e. not deleted, rows in this table. More... | |
THash< TInt, TInt > | GetRowIdMap () const |
Gets a map of logical to physical row ids. More... | |
TRowIterator | BegRI () const |
Gets iterator to the first valid row of the table. More... | |
TRowIterator | EndRI () const |
Gets iterator to the last valid row of the table. More... | |
TRowIteratorWithRemove | BegRIWR () |
Gets iterator with remove to the first valid row. More... | |
TRowIteratorWithRemove | EndRIWR () |
Gets iterator with remove to the last valid row. More... | |
void | GetPartitionRanges (TIntPrV &Partitions, TInt NumPartitions) const |
Partitions the table into NumPartitions partitions and populates Partitions with the ranges. More... | |
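A minimal sketch of range partitioning, assuming rows are numbered 0..NumRows-1 and each range is an inclusive (start, end) pair. Whether SNAP uses inclusive ranges, or balances partitions this way, is not stated in this reference; the helper `PartitionRanges` is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Splits rows 0..NumRows-1 into at most NumPartitions contiguous, inclusive
// ranges whose sizes differ by at most one row.
std::vector<std::pair<int, int>> PartitionRanges(int NumRows, int NumPartitions) {
  assert(NumRows >= 0 && NumPartitions > 0);
  std::vector<std::pair<int, int>> Ranges;
  const int Base = NumRows / NumPartitions;
  const int Extra = NumRows % NumPartitions;  // the first Extra ranges get one more row
  int Start = 0;
  for (int P = 0; P < NumPartitions; ++P) {
    const int Size = Base + (P < Extra ? 1 : 0);
    if (Size == 0) break;  // fewer rows than partitions
    Ranges.emplace_back(Start, Start + Size - 1);
    Start += Size;
  }
  return Ranges;
}
```

Ranges of this kind are typically handed to worker threads so that each one processes a disjoint block of rows.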
void | Rename (const TStr &Column, const TStr &NewLabel) |
Renames a column. More... | |
void | Unique (const TStr &Col) |
Removes rows with duplicate values in given column. More... | |
void | Unique (const TStrV &Cols, TBool Ordered=true) |
Removes rows with duplicate values in given columns. More... | |
void | Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true) |
Selects rows that satisfy given Predicate . More... | |
void | Select (TPredicate &Predicate) |
void | Classify (TPredicate &Predicate, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true) |
Selects rows using atomic compare operation. More... | |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp) |
void | ClassifyAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true) |
Selects rows where the value of Col matches given primitive Val . More... | |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp) |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, PTable &SelectedTable) |
template<class T > | |
void | ClassifyAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | Group (const TStrV &GroupBy, const TStr &GroupColName, TBool Ordered=true, TBool UsePhysicalIds=true) |
Groups rows depending on values of GroupBy columns. More... | |
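The grouping idea can be sketched as follows: rows whose GroupBy values are all equal receive the same group id, which would then be stored in a new column such as GroupColName. The function `GroupIds`, the integer-only columns, and the choice to number groups in order of first appearance are assumptions of this sketch, not SNAP behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Returns one group id per row; rows with equal key tuples share an id.
// Ids are assigned in order of first appearance (an assumption of this sketch).
std::vector<int> GroupIds(const std::vector<std::vector<int>>& GroupByCols) {
  assert(!GroupByCols.empty());
  const std::size_t NumRows = GroupByCols[0].size();
  std::map<std::vector<int>, int> KeyToId;
  std::vector<int> Ids;
  for (std::size_t Row = 0; Row < NumRows; ++Row) {
    std::vector<int> Key;  // the tuple of GroupBy values for this row
    for (const std::vector<int>& Col : GroupByCols) Key.push_back(Col.at(Row));
    std::map<std::vector<int>, int>::iterator It = KeyToId.find(Key);
    if (It == KeyToId.end()) {
      const int NewId = static_cast<int>(KeyToId.size());
      It = KeyToId.insert(std::make_pair(Key, NewId)).first;
    }
    Ids.push_back(It->second);
  }
  return Ids;
}
```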
void | Count (const TStr &CountColName, const TStr &Col) |
Counts number of unique elements. More... | |
void | Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true) |
Orders the rows according to the values in columns of OrderBy (in descending lexicographic order). More... | |
void | Aggregate (const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as new attribute ResAttr. More... | |
void | AggregateCols (const TStrV &AggrAttrs, TAttrAggr AggOp, const TStr &ResAttr) |
Aggregates attributes in AggrAttrs across columns. More... | |
TVec< PTable > | SpliceByGroup (const TStrV &GroupByAttrs, TBool Ordered=true) |
Splices table into subtables according to a grouping statement. More... | |
PTable | Join (const TStr &Col1, const TTable &Table, const TStr &Col2) |
Performs equijoin. More... | |
PTable | Join (const TStr &Col1, const PTable &Table, const TStr &Col2) |
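To make the equijoin semantics concrete, here is a small hash-join sketch over integer columns: it indexes the join column of the right table and emits every (left row, right row) pair with equal keys. The `Table` alias and `EquiJoinRows` are hypothetical names; the sketch returns row-index pairs rather than building a joined table, and it does not claim to mirror how SNAP implements Join.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using Column = std::vector<int>;
using Table = std::vector<Column>;  // Table[c][r]: column-major storage

// Returns (row in Left, row in Right) pairs where Left[Col1] == Right[Col2].
std::vector<std::pair<std::size_t, std::size_t>> EquiJoinRows(
    const Table& Left, std::size_t Col1, const Table& Right, std::size_t Col2) {
  assert(Col1 < Left.size() && Col2 < Right.size());
  const Column& LeftKeys = Left[Col1];
  const Column& RightKeys = Right[Col2];

  // Build phase: key value -> rows of Right holding that value.
  std::unordered_map<int, std::vector<std::size_t>> Index;
  for (std::size_t R = 0; R < RightKeys.size(); ++R) Index[RightKeys[R]].push_back(R);

  // Probe phase: look up each Left key and emit all matches.
  std::vector<std::pair<std::size_t, std::size_t>> Out;
  for (std::size_t L = 0; L < LeftKeys.size(); ++L) {
    auto It = Index.find(LeftKeys[L]);
    if (It == Index.end()) continue;
    for (std::size_t R : It->second) Out.emplace_back(L, R);
  }
  return Out;
}
```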
PTable | ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false) |
PTable | SelfJoin (const TStr &Col) |
Joins table with itself, on values of Col . More... | |
PTable | SelfSimJoin (const TStrV &Cols, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
PTable | SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
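The similarity joins above pair rows whose distance is below Threshold. The sketch below shows the idea with a nested loop and Euclidean (L2) distance; the choice of L2, the `Row` alias, and the helper names are assumptions, since this reference does not list the values of TSimType or describe the join algorithm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Row = std::vector<double>;  // values of the compared columns for one row

// Euclidean distance between two rows of equal width.
double L2Distance(const Row& A, const Row& B) {
  assert(A.size() == B.size());
  double Sum = 0.0;
  for (std::size_t i = 0; i < A.size(); ++i) Sum += (A[i] - B[i]) * (A[i] - B[i]);
  return std::sqrt(Sum);
}

// Returns (row in Left, row in Right) pairs with distance strictly below Threshold.
std::vector<std::pair<std::size_t, std::size_t>> SimJoinRows(
    const std::vector<Row>& Left, const std::vector<Row>& Right, double Threshold) {
  std::vector<std::pair<std::size_t, std::size_t>> Out;
  for (std::size_t L = 0; L < Left.size(); ++L) {
    for (std::size_t R = 0; R < Right.size(); ++R) {
      if (L2Distance(Left[L], Right[R]) < Threshold) Out.emplace_back(L, R);
    }
  }
  return Out;
}
```

The nested loop costs one distance computation per row pair, which is only practical for small inputs.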
void | SelectFirstNRows (const TInt &N) |
Selects first N rows from the table. More... | |
void | Defrag () |
Releases memory of deleted rows, and defrags. More... | |
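The distinction between total rows and valid (not deleted) rows, and the role of Defrag, can be pictured with a lazy-deletion scheme: removing a row only marks it, and a later compaction pass reclaims the storage. The `LazyDeleteTable` type below is a hypothetical single-column model; SNAP's actual bookkeeping for deleted rows may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-column table with lazy deletion; not SNAP's layout.
struct LazyDeleteTable {
  std::vector<int> Col;
  std::vector<bool> Valid;

  void AddRow(int V) { Col.push_back(V); Valid.push_back(true); }

  // Marks the row as deleted; its storage is kept until Defrag().
  void RemoveRow(std::size_t RowIdx) {
    assert(RowIdx < Valid.size());
    Valid[RowIdx] = false;
  }

  std::size_t NumRows() const { return Col.size(); }  // includes deleted rows
  std::size_t NumValidRows() const {
    std::size_t N = 0;
    for (bool B : Valid) if (B) ++N;
    return N;
  }

  // Compacts valid rows to the front and releases the unused tail.
  void Defrag() {
    std::size_t Out = 0;
    for (std::size_t In = 0; In < Col.size(); ++In) {
      if (Valid[In]) Col[Out++] = Col[In];
    }
    Col.resize(Out);
    Col.shrink_to_fit();
    Valid.assign(Out, true);
  }
};
```

Note that compaction changes the physical position of surviving rows, which is why a separate logical-to-physical row id map is useful.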
void | StoreIntCol (const TStr &ColName, const TIntV &ColVals) |
Adds entire int column to table. More... | |
void | StoreFltCol (const TStr &ColName, const TFltV &ColVals) |
Adds entire flt column to table. More... | |
void | StoreStrCol (const TStr &ColName, const TStrV &ColVals) |
Adds entire str column to table. More... | |
void | UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | SetFltColToConstMP (TInt UpdateColIdx, TFlt DefaultFltVal) |
PTable | Union (const TTable &Table) |
Returns union of this table with given Table . More... | |
PTable | Union (const PTable &Table) |
PTable | UnionAll (const TTable &Table) |
Returns union of this table with given Table , preserving duplicates. More... | |
PTable | UnionAll (const PTable &Table) |
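The difference between Union and UnionAll is the handling of duplicate rows. The sketch below models rows as integer vectors: `UnionAllRows` simply concatenates, while `UnionRows` keeps the first occurrence of each distinct row. Both helpers are hypothetical, and treating Union as full set semantics (also removing duplicates within one input) is an assumption.

```cpp
#include <cassert>
#include <set>
#include <vector>

using Row = std::vector<int>;

// Concatenates the rows of A and B, preserving duplicates.
std::vector<Row> UnionAllRows(const std::vector<Row>& A, const std::vector<Row>& B) {
  // Both inputs are expected to have the same number of columns.
  assert(A.empty() || B.empty() || A[0].size() == B[0].size());
  std::vector<Row> Out(A);
  Out.insert(Out.end(), B.begin(), B.end());
  return Out;
}

// Like UnionAllRows, but keeps only the first occurrence of each distinct row.
std::vector<Row> UnionRows(const std::vector<Row>& A, const std::vector<Row>& B) {
  const std::vector<Row> All = UnionAllRows(A, B);
  std::set<Row> Seen;
  std::vector<Row> Out;
  for (const Row& R : All) {
    if (Seen.insert(R).second) Out.push_back(R);
  }
  return Out;
}
```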
void | UnionAllInPlace (const TTable &Table) |
Same as TTable::ConcatTable. More... | |
void | UnionAllInPlace (const PTable &Table) |
PTable | Intersection (const TTable &Table) |
Returns intersection of this table with given Table . More... | |
PTable | Intersection (const PTable &Table) |
PTable | Minus (TTable &Table) |
Returns table with rows that are present in this table but not in given Table . More... | |
PTable | Minus (const PTable &Table) |
PTable | Project (const TStrV &ProjectCols) |
Returns table with only the columns in ProjectCols . More... | |
void | ProjectInPlace (const TStrV &ProjectCols) |
Keeps only the columns specified in ProjectCols . More... | |
void | ColGenericOp (const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op) |
Performs columnwise arithmetic operation. More... | |
void | ColGenericOpMP (TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op) |
void | ColAdd (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise addition. See TTable::ColGenericOp. More... | |
void | ColSub (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise subtraction. See TTable::ColGenericOp. More... | |
void | ColMul (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise multiplication. See TTable::ColGenericOp. More... | |
void | ColDiv (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise division. See TTable::ColGenericOp. More... | |
void | ColMod (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise modulus. See TTable::ColGenericOp. More... | |
void | ColMin (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs min of two columns. See TTable::ColGenericOp. More... | |
void | ColMax (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs max of two columns. See TTable::ColGenericOp. More... | |
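The column operations above all apply one binary operator row by row to two columns and write the result to a third. A minimal sketch over integer columns follows; the `ArithOp` enum is a hypothetical mirror of TArithOp (whose actual enumerators are not listed here), and `ColOp` returns the result column instead of storing it under ResultAttrName.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of TArithOp; the real enumerator names may differ.
enum class ArithOp { Add, Sub, Mul, Div, Mod, Min, Max };

// Applies Op element-wise to columns A and B and returns the result column.
std::vector<int> ColOp(const std::vector<int>& A, const std::vector<int>& B, ArithOp Op) {
  assert(A.size() == B.size());
  std::vector<int> Res(A.size());
  for (std::size_t i = 0; i < A.size(); ++i) {
    switch (Op) {
      case ArithOp::Add: Res[i] = A[i] + B[i]; break;
      case ArithOp::Sub: Res[i] = A[i] - B[i]; break;
      case ArithOp::Mul: Res[i] = A[i] * B[i]; break;
      case ArithOp::Div: assert(B[i] != 0); Res[i] = A[i] / B[i]; break;  // integer division
      case ArithOp::Mod: assert(B[i] != 0); Res[i] = A[i] % B[i]; break;
      case ArithOp::Min: Res[i] = std::min(A[i], B[i]); break;
      case ArithOp::Max: Res[i] = std::max(A[i], B[i]); break;
    }
  }
  return Res;
}
```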
void | ColGenericOp (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr, TArithOp op, TBool AddToFirstTable) |
Performs columnwise arithmetic operation with column of given table. More... | |
void | ColAdd (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise addition with column of given table. More... | |
void | ColSub (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise subtraction with column of given table. More... | |
void | ColMul (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise multiplication with column of given table. More... | |
void | ColDiv (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise division with column of given table. More... | |
void | ColMod (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise modulus with column of given table. More... | |
void | ColGenericOp (const TStr &Attr1, const TFlt &Num, const TStr &ResAttr, TArithOp op, const TBool floatCast) |
Performs arithmetic op of column values and given Num . More... | |
void | ColGenericOpMP (const TInt &ColIdx1, const TInt &ColIdx2, TAttrType ArgType, const TFlt &Num, TArithOp op, TBool ShouldCast) |
void | ColAdd (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs addition of column values and given Num . More... | |
void | ColSub (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs subtraction of column values and given Num . More... | |
void | ColMul (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs multiplication of column values and given Num . More... | |
void | ColDiv (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs division of column values and given Num . More... | |
void | ColMod (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs modulus of column values and given Num . More... | |
void | ColConcat (const TStr &Attr1, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates two string columns. More... | |
void | ColConcat (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="", TBool AddToFirstTable=true) |
Concatenates string column with column of given table. More... | |
void | ColConcatConst (const TStr &Attr1, const TStr &Val, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates column values with given string value. More... | |
void | ReadIntCol (const TStr &ColName, TIntV &Result) const |
Reads values of entire int column into Result . More... | |
void | ReadFltCol (const TStr &ColName, TFltV &Result) const |
Reads values of entire float column into Result . More... | |
void | ReadStrCol (const TStr &ColName, TStrV &Result) const |
Reads values of entire string column into Result . More... | |
void | InitIds () |
Adds explicit row ids and initializes the hash set mapping ids to physical rows. More... | |
PTable | IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="") |
Distance based filter. More... | |
void | PrintSize () |
void | PrintContextSize () |
TSize | GetMemUsedKB () |
Returns approximate memory used by table in [KB]. More... | |
TSize | GetContextMemUsedKB () |
Returns approximate memory used by table context in [KB]. More... | |
Static Public Member Functions | |
static void | SetMP (TInt Value) |
static TInt | GetMP () |
static TStr | NormalizeColName (const TStr &ColName) |
Adds the suffix to the column name if it is not already present. More... | |
static TStrV | NormalizeColNameV (const TStrV &Cols) |
Adds the suffix to each column name in Cols if it is not already present. More... | |
static PTable | New () |
static PTable | New (TTableContext *Context) |
static PTable | New (const Schema &S, TTableContext *Context) |
static PTable | New (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->int hash. More... | |
static PTable | New (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->float hash. More... | |
static PTable | New (const PTable Table) |
Returns pointer to a new table created from given Table . More... | |
static void | GetSchema (const TStr &InFNm, Schema &S, const char &Separator= '\t') |
Gets the schema from the file InFNm and stores it in S . More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from a spreadsheet file (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment out title lines instead. More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const TIntV &RelevantCols, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from a spreadsheet file, loading only the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment out title lines instead. More... | |
static PTable | Load (TSIn &SIn, TTableContext *Context) |
Loads table from a binary format. More... | |
static PTable | LoadShM (TShMIn &ShMIn, TTableContext *Context) |
Static constructor to load table from memory. More... | |
static PTable | TableFromHashMap (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->int. More... | |
static PTable | TableFromHashMap (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->float. More... | |
static PTable | GetNodeTable (const PNEANet &Network, TTableContext *Context) |
Extracts node TTable from PNEANet. More... | |
static PTable | GetEdgeTable (const PNEANet &Network, TTableContext *Context) |
Extracts edge TTable from PNEANet. More... | |
static PTable | GetEdgeTablePN (const PNGraphMP &Network, TTableContext *Context) |
Extracts edge TTable from parallel graph PNGraphMP. More... | |
static PTable | GetFltNodePropertyTable (const PNEANet &Network, const TIntFltH &Property, const TStr &NodeAttrName, const TAttrType &NodeAttrType, const TStr &PropertyAttrName, TTableContext *Context) |
Extracts node and edge property TTables from THash. More... | |
Protected Member Functions | |
void | InvalidatePhysicalGroupings () |
void | InvalidateAffectedGroupings (const TStr &Attr) |
void | IncrementNext () |
Increments the next vector and sets last, NumRows and NumValidRows. More... | |
void | ClassifyAux (const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
Adds a label attribute with positive labels on selected rows and negative labels on the rest. More... | |
const char * | GetContextKey (TInt Val) const |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp. More... | |
TStr | GetStrVal (TInt ColIdx, TInt RowIdx) const |
Gets the value in column with id ColIdx at row RowIdx . More... | |
void | AddStrVal (const TInt &ColIdx, const TStr &Val) |
Adds Val in column with id ColIdx . More... | |
void | AddStrVal (const TStr &Col, const TStr &Val) |
Adds Val in column with name Col . More... | |
TStr | GetIdColName () const |
Gets name of the id column of this table. More... | |
TStr | GetSchemaColName (TInt Idx) const |
Gets name of the column with index Idx in the schema. More... | |
TAttrType | GetSchemaColType (TInt Idx) const |
Gets type of the column with index Idx in the schema. More... | |
void | AddSchemaCol (const TStr &ColName, TAttrType ColType) |
Adds column with name ColName and type ColType to the schema. More... | |
TBool | IsColName (const TStr &ColName) const |
void | AddColType (const TStr &ColName, TPair< TAttrType, TInt > ColType) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | AddColType (const TStr &ColName, TAttrType ColType, TInt Index) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | DelColType (const TStr &ColName) |
Deletes the column with name ColName from the ColTypeMap. More... | |
TPair< TAttrType, TInt > | GetColTypeMap (const TStr &ColName) const |
Gets column type and index of ColName . More... | |
TStr | RenumberColName (const TStr &ColName) const |
Returns a re-numbered column name based on number of existing columns with conflicting names. More... | |
TStr | DenormalizeColName (const TStr &ColName) const |
Removes the suffix from the column name, if present. More... | |
Schema | DenormalizeSchema () const |
Removes the suffixes from column names in the Schema. More... | |
TBool | IsAttr (const TStr &Attr) |
Checks if Attr is an attribute of this table schema. More... | |
void | AddTable (const TTable &T) |
Adds all the rows of the input table. Allows duplicate rows (not a union). More... | |
void | ConcatTable (const PTable &T) |
Appends all rows of T to this table, and recalculates indices. More... | |
void | AddRow (const TRowIterator &RI) |
Adds row corresponding to RI . More... | |
void | AddRow (const TIntV &IntVals, const TFltV &FltVals, const TStrV &StrVals) |
Adds row with values corresponding to the given vectors by type. More... | |
void | AddGraphAttribute (const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds names of columns to be used as graph attributes. More... | |
void | AddGraphAttributeV (TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds vector of names of columns to be used as graph attributes. More... | |
void | CheckAndAddIntNode (PNEANet Graph, THashSet< TInt > &NodeVals, TInt NodeId) |
Checks if given NodeId has been seen earlier; if not, adds it to Graph and hash set NodeVals . More... | |
template<class T > | |
TInt | CheckAndAddFltNode (T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal) |
Checks if given NodeVal has been seen earlier; if not, adds it to Graph and hashmap NodeVals . More... | |
void | AddEdgeAttributes (PNEANet &Graph, int RowId) |
Adds attributes of edge corresponding to RowId to the Graph . More... | |
void | AddNodeAttributes (TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs) |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values). More... | |
PNEANet | BuildGraph (const TIntV &RowIds, TAttrAggr AggrPolicy) |
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes. More... | |
void | InitRowIdBuckets (int NumBuckets) |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation. More... | |
void | FillBucketsByWindow (TStr SplitAttr, TInt JumpSize, TInt WindowSize, TInt StartVal, TInt EndVal) |
Fills RowIdBuckets with sets of row ids. More... | |
void | FillBucketsByInterval (TStr SplitAttr, TIntPrV SplitIntervals) |
Fills RowIdBuckets with sets of row ids. More... | |
TVec< PNEANet > | GetGraphsFromSequence (TAttrAggr AggrPolicy) |
Returns a sequence of graphs. More... | |
PNEANet | GetFirstGraphFromSequence (TAttrAggr AggrPolicy) |
Returns the first graph of the sequence. More... | |
PNEANet | GetNextGraphFromSequence () |
Returns the next graph in sequence corresponding to RowIdBuckets. More... | |
template<class T > | |
T | AggregateVector (TVec< T > &V, TAttrAggr Policy) |
Aggregates vector into a single scalar value according to a policy. More... | |
void | GroupingSanityCheck (const TStr &GroupBy, const TAttrType &AttrType) const |
Checks if grouping key exists and matches given attr type. More... | |
template<class T > | |
void | GroupByIntCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values. More... | |
template<class T > | |
void | GroupByFltCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with float values. Returns hash table with grouping. More... | |
template<class T > | |
void | GroupByStrCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with string values. Returns hash table with grouping. More... | |
template<class T > | |
void | UpdateGrouping (THash< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a grouping hash map. More... | |
template<class T > | |
void | UpdateGrouping (THashMP< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a parallel grouping hash map. More... | |
void | PrintGrouping (const THash< TGroupKey, TIntV > &Grouping) const |
TInt | CompareRows (TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
TInt | CompareRows (TInt R1, TInt R2, const TVec< TAttrType > &CompareByTypes, const TIntV &CompareByIndices, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
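The "strcmp semantics" of CompareRows can be illustrated with a lexicographic comparison over several columns: the first column that differs decides the sign, and 0 means the rows are equal on all compared columns. The sketch below uses integer columns only, and it assumes that Asc=false simply flips the sign; both the `Table` alias and that reading of Asc are assumptions rather than documented SNAP behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<int>>;  // Table[c][r]: column-major storage

// Compares rows R1 and R2 column by column in the order given by CompareBy.
// Returns >0 if R1 is bigger, <0 if R2 is bigger, 0 if equal (strcmp style).
int CompareRows(const Table& T, std::size_t R1, std::size_t R2,
                const std::vector<std::size_t>& CompareBy, bool Asc = true) {
  for (std::size_t Col : CompareBy) {
    assert(Col < T.size());
    const int A = T[Col].at(R1);
    const int B = T[Col].at(R2);
    if (A != B) {
      const int Sign = (A > B) ? 1 : -1;
      return Asc ? Sign : -Sign;  // assumed: descending order flips the result
    }
  }
  return 0;  // equal on every compared column
}
```

A comparator with this contract can drive the sorting routines that follow, since they only need to know the relative order of two row ids.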
TInt | GetPivot (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Gets pivot element for QSort. More... | |
TInt | Partition (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Partitions vector for QSort. More... | |
void | ISort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs insertion sort on given vector V . More... | |
void | QSort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort on given vector V . More... | |
void | Merge (TIntV &V, TInt Idx1, TInt Idx2, TInt Idx3, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Helper function for parallel QSort. More... | |
void | QSortPar (TIntV &V, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort in parallel on given vector V . More... | |
bool | IsRowValid (TInt RowIdx) const |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row. More... | |
TInt | GetLastValidRowIdx () |
Gets the id of the last valid row of the table. More... | |
void | RemoveFirstRow () |
Removes first valid row of the table. More... | |
void | RemoveRow (TInt RowIdx, TInt PrevRowIdx) |
Removes row with id RowIdx . More... | |
void | KeepSortedRows (const TIntV &KeepV) |
Removes all rows that are not mentioned in the SORTED vector KeepV . More... | |
void | SetFirstValidRow () |
Sets the first valid row of the TTable. More... | |
PTable | InitializeJointTable (const TTable &Table) |
Initializes an empty table for the join of this table with the given table. More... | |
void | AddJointRow (const TTable &T1, const TTable &T2, TInt RowIdx1, TInt RowIdx2) |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2]. More... | |
void | ThresholdJoinInputCorrectness (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2) |
void | ThresholdJoinCountCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntPr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinOutputTable (const THash< TIntPr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ThresholdJoinCountPerJoinKeyCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntTr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinPerJoinKeyOutputTable (const THash< TIntTr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ResizeTable (int RowCount) |
Resizes the table to hold RowCount rows. More... | |
int | GetEmptyRowsStart (int NewRows) |
Gets the start index to a chunk of empty rows of size NewRows . More... | |
void | AddSelectedRows (const TTable &Table, const TIntV &RowIDs) |
Adds rows from Table that correspond to ids in RowIDs . More... | |
void | AddNRows (int NewRows, const TVec< TIntV > &IntColsP, const TVec< TFltV > &FltColsP, const TVec< TIntV > &StrColMapsP) |
Adds NewRows rows from the given vectors for each column type. More... | |
void | AddNJointRowsMP (const TTable &T1, const TTable &T2, const TVec< TIntPrV > &JointRowIDSet) |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join. More... | |
void | UpdateTableForNewRow () |
Updates table state after adding one or more rows. More... | |
void | GroupAux (const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true) |
Helper function for grouping. More... | |
void | StoreGroupCol (const TStr &GroupColName, const TVec< TPair< TInt, TInt > > &GroupAndRowIds) |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported. More... | |
void | Reindex () |
Reinitializes row ids. More... | |
void | AddIdColumn (const TStr &IdColName) |
Adds a column of explicit integer identifiers to the rows. More... | |
void | GetCollidingRows (const TTable &T, THashSet< TInt > &Collisions) |
Gets set of row ids of rows common with table T . More... | |
Static Protected Member Functions | |
static void | LoadSSPar (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns. More... | |
static void | LoadSSSeq (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Sequentially loads data from input file at InFNm into NewTable. More... | |
static TInt | CompareKeyVal (const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2) |
static TInt | CheckSortedKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | ISortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | GetPivotKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | PartitionKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | QSortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
Protected Attributes | |
TTableContext * | Context |
Execution Context. More... | |
Schema | Sch |
Table Schema. More... | |
TCRef | CRef |
TInt | NumRows |
Number of rows in the table (valid and invalid). More... | |
TInt | NumValidRows |
Number of valid rows in the table (i.e. rows that were not logically removed). More... | |
TInt | FirstValidRow |
Physical index of first valid row. More... | |
TInt | LastValidRow |
Physical index of last valid row. More... | |
TIntV | Next |
A vector describing the logical order of the rows. Next[i] is the successor of row i. Table iterators follow the order dictated by Next. More... | |
TVec< TIntV > | IntCols |
Data columns of integer attributes. More... | |
TVec< TFltV > | FltCols |
Data columns of floating point attributes. More... | |
TVec< TIntV > | StrColMaps |
Data columns of integer mappings of string attributes. More... | |
THash< TStr, TPair< TAttrType, TInt > > | ColTypeMap |
A mapping from column name to column type and column index among columns of the same type. More... | |
TStr | IdColName |
TIntIntH | RowIdMap |
Mapping of permanent row ids to physical id. More... | |
THash< TStr, THash< TInt, TIntV > > | IntColIndexes |
Indexes for Int Columns. More... | |
THash< TStr, THash< TInt, TIntV > > | StrMapColIndexes |
Indexes for String Columns. More... | |
THash< TStr, THash< TFlt, TIntV > > | FltColIndexes |
Indexes for Float Columns. More... | |
THash< TStr, GroupStmt > | GroupStmtNames |
Maps user-given grouping statement names to their group-by attributes. More... | |
THash< GroupStmt, THash< TInt, TGroupKey > > | GroupIDMapping |
Maps grouping statements to their (group id –> group-by key) mapping. More... | |
THash< GroupStmt, THash < TGroupKey, TIntV > > | GroupMapping |
Maps grouping statements to their (group-by key –> group id) mapping. More... | |
TStr | SrcCol |
Column (attribute) to serve as src nodes when constructing the graph. More... | |
TStr | DstCol |
Column (attribute) to serve as dst nodes when constructing the graph. More... | |
TStrV | EdgeAttrV |
List of columns (attributes) to serve as edge attributes. More... | |
TStrV | SrcNodeAttrV |
List of columns (attributes) to serve as source node attributes. More... | |
TStrV | DstNodeAttrV |
List of columns (attributes) to serve as destination node attributes. More... | |
TStrTrV | CommonNodeAttrs |
List of attribute pairs with values common to source and destination and their common given name. More... | |
TVec< TIntV > | RowIdBuckets |
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs. More... | |
TInt | CurrBucket |
Current row id bucket - used when generating a sequence of graphs using an iterator. More... | |
TAttrAggr | AggrPolicy |
Aggregation policy used for solving conflicts between different values of an attribute of the same node. More... | |
TInt | IsNextDirty |
Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges. More... | |
Static Protected Attributes | |
static const TInt | Last = -1 |
Special value for Next vector entry - last row in table. More... | |
static const TInt | Invalid = -2 |
Special value for Next vector entry - logically removed row. More... | |
static TInt | UseMP = 1 |
Global switch for choosing multi-threaded versions of TTable functions. More... | |
Private Member Functions | |
void | GenerateColTypeMap (THash< TStr, TPair< TInt, TInt > > &ColTypeIntMap) |
void | LoadTableShM (TShMIn &ShMIn, TTableContext *ContextTable) |
Friends | |
class | TPt< TTable > |
class | TRowIterator |
class | TRowIteratorWithRemove |
template<class PGraph > | |
PGraph | TSnap::ToGraph (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
int | TSnap::LoadCrossNet (TCrossNet &Graph, PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV) |
int | TSnap::LoadMode (TModeNet &Graph, PTable Table, const TStr &NCol, TStrV &NodeAttrV) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP3 (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP2 (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
TTable::TTable | ( | ) |
Definition at line 302 of file table.cpp.
TTable::TTable | ( | TTableContext * | Context | ) |
Definition at line 305 of file table.cpp.
TTable::TTable | ( | const Schema & | S, |
TTableContext * | Context | ||
) |
Definition at line 308 of file table.cpp.
TTable::TTable | ( | TSIn & | SIn, |
TTableContext * | Context | ||
) |
Definition at line 378 of file table.cpp.
TTable::TTable | ( | const THash< TInt, TInt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->int.
Definition at line 385 of file table.cpp.
TTable::TTable | ( | const THash< TInt, TFlt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->float.
Definition at line 412 of file table.cpp.
|
inline |
Copy constructor.
Definition at line 919 of file table.h.
Definition at line 438 of file table.cpp.
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 651 of file table.h.
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 656 of file table.h.
|
inline |
Adds column to be used as dst node attribute of the graph.
Definition at line 1180 of file table.h.
|
inline |
Adds columns to be used as dst node attributes of the graph.
Definition at line 1182 of file table.h.
|
inline |
Adds column to be used as graph edge attribute.
Definition at line 1172 of file table.h.
|
inline |
Adds columns to be used as graph edge attributes.
Definition at line 1174 of file table.h.
|
inlineprotected |
Adds attributes of edge corresponding to RowId to the Graph.
Definition at line 3395 of file table.cpp.
void TTable::AddFltCol | ( | const TStr & | ColName | ) |
Adds a float column with name ColName.
Definition at line 4680 of file table.cpp.
|
protected |
Adds names of columns to be used as graph attributes.
Definition at line 985 of file table.cpp.
Adds vector of names of columns to be used as graph attributes.
Definition at line 992 of file table.cpp.
|
protected |
Adds a column of explicit integer identifiers to the rows.
Definition at line 1900 of file table.cpp.
void TTable::AddIntCol | ( | const TStr & | ColName | ) |
Adds an integer column with name ColName.
Definition at line 4673 of file table.cpp.
|
protected |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2].
Definition at line 1957 of file table.cpp.
|
protected |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join.
Definition at line 4442 of file table.cpp.
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1184 of file table.h.
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1186 of file table.h.
|
inlineprotected |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values).
Definition at line 3414 of file table.cpp.
|
protected |
Adds NewRows rows from the given vectors for each column type.
Definition at line 4421 of file table.cpp.
|
protected |
Adds row corresponding to RI.
Definition at line 4295 of file table.cpp.
|
protected |
Adds row with values corresponding to the given vectors by type.
Definition at line 4317 of file table.cpp.
|
inline |
Adds row with values taken from given TTableRow.
Definition at line 1002 of file table.h.
Adds column with name ColName and type ColType to the schema.
Definition at line 642 of file table.h.
Adds rows from Table that correspond to ids in RowIDs.
Definition at line 4399 of file table.cpp.
|
inline |
Adds column to be used as src node attribute of the graph.
Definition at line 1176 of file table.h.
|
inline |
Adds columns to be used as src node attributes of the graph.
Definition at line 1178 of file table.h.
void TTable::AddStrCol | ( | const TStr & | ColName | ) |
Adds a string column with name ColName.
Definition at line 4687 of file table.cpp.
Adds Val in column with id ColIdx.
Definition at line 971 of file table.cpp.
Adds Val in column with name Col.
Definition at line 977 of file table.cpp.
|
protected |
Adds all the rows of the input table. Allows duplicate rows (not a union).
Definition at line 3975 of file table.cpp.
void TTable::Aggregate | ( | const TStrV & | GroupByAttrs, |
TAttrAggr | AggOp, | ||
const TStr & | ValAttr, | ||
const TStr & | ResAttr, | ||
TBool | Ordered = true |
||
) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as a new attribute ResAttr.
Definition at line 1585 of file table.cpp.
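The group-then-aggregate behavior described above can be pictured with a small standalone sketch. It does not use SNAP: `ToyAggr` and `ToyAggregate` are hypothetical names, the table is reduced to one integer key column and one integer value column, and only three illustrative policies are modeled. The assumption that every row receives the aggregate of its own group is a reading of "stored as a new attribute", not a statement about the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical subset of aggregation policies (not SNAP's TAttrAggr).
enum class ToyAggr { Sum, Count, Max };

// Groups rows by Key, aggregates Val inside each group, and returns a result
// column in which every row carries the aggregate of its own group.
std::vector<int> ToyAggregate(const std::vector<int>& Key,
                              const std::vector<int>& Val, ToyAggr Op) {
  assert(Key.size() == Val.size());
  std::map<int, int> Acc;  // group key -> running aggregate
  for (std::size_t r = 0; r < Key.size(); r++) {
    const int Start = (Op == ToyAggr::Count) ? 1 : Val[r];
    auto Ins = Acc.insert({Key[r], Start});
    if (Ins.second) { continue; }  // first row seen for this group
    int& A = Ins.first->second;
    if (Op == ToyAggr::Sum) { A += Val[r]; }
    else if (Op == ToyAggr::Count) { A += 1; }
    else if (Val[r] > A) { A = Val[r]; }  // ToyAggr::Max
  }
  std::vector<int> Res(Key.size());
  for (std::size_t r = 0; r < Key.size(); r++) { Res[r] = Acc[Key[r]]; }
  return Res;
}
```

With keys `{1, 2, 1}` and values `{5, 7, 9}`, the `Sum` policy would give each row of group 1 the value 14 and the single row of group 2 the value 7.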
Aggregates attributes in AggrAttrs across columns.
Definition at line 1750 of file table.cpp.
Aggregates vector into a single scalar value according to a policy.
Used for choosing an attribute value for a node when the node appears in several records with conflicting attribute values.
Definition at line 1544 of file table.h.
|
inline |
|
inline |
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes.
Definition at line 3445 of file table.cpp.
TTableContext * TTable::ChangeContext | ( | TTableContext * | Context | ) |
Changes the current context. Moves all object items to the new context.
Definition at line 921 of file table.cpp.
|
protected |
|
inlineprotected |
Definition at line 5310 of file table.cpp.
void TTable::Classify | ( | TPredicate & | Predicate, |
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2805 of file table.cpp.
void TTable::ClassifyAtomic | ( | const TStr & | Col1, |
const TStr & | Col2, | ||
TPredComp | Cmp, | ||
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2866 of file table.cpp.
|
inline |
Definition at line 1301 of file table.h.
|
protected |
Adds a label attribute with positive labels on selected rows and negative labels on the rest.
Definition at line 4694 of file table.cpp.
Performs columnwise addition. See TTable::ColGenericOp.
Definition at line 4816 of file table.cpp.
void TTable::ColAdd | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise addition with column of given table.
Definition at line 4949 of file table.cpp.
void TTable::ColAdd | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs addition of column values and given Num.
Definition at line 5063 of file table.cpp.
void TTable::ColConcat | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates two string columns.
Definition at line 5083 of file table.cpp.
void TTable::ColConcat | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Concatenates string column with column of given table.
Definition at line 5117 of file table.cpp.
void TTable::ColConcatConst | ( | const TStr & | Attr1, |
const TStr & | Val, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates column values with given string value.
Definition at line 5182 of file table.cpp.
Performs columnwise division. See TTable::ColGenericOp.
Definition at line 4828 of file table.cpp.
void TTable::ColDiv | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise division with column of given table.
Definition at line 4964 of file table.cpp.
void TTable::ColDiv | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs division of column values and given Num.
Definition at line 5075 of file table.cpp.
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op | ||
) |
Performs columnwise arithmetic operation.
Performs Attr1 OP Attr2 and stores the result in Attr1. If ResAttr != "", the result is stored in a new column ResAttr instead.
Definition at line 4752 of file table.cpp.
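The destination rule for columnwise operations (overwrite Attr1 unless ResAttr is non-empty) can be sketched without SNAP as follows. `ToyTable`, `ToyOp` and `ToyColGenericOp` are hypothetical names introduced for illustration; the sketch models integer columns only and ignores type casting and the multi-threaded path.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a table with integer columns only (not SNAP's TTable).
using ToyTable = std::map<std::string, std::vector<int>>;

enum class ToyOp { Add, Sub, Mul };

// Applies Attr1 OP Attr2 row by row. The result overwrites Attr1 when ResAttr
// is empty; otherwise it is written to a column named ResAttr.
void ToyColGenericOp(ToyTable& T, const std::string& Attr1,
                     const std::string& Attr2, const std::string& ResAttr,
                     ToyOp Op) {
  const std::vector<int>& A = T.at(Attr1);
  const std::vector<int>& B = T.at(Attr2);
  assert(A.size() == B.size());
  std::vector<int> Res(A.size());
  for (std::size_t i = 0; i < A.size(); i++) {
    switch (Op) {
      case ToyOp::Add: Res[i] = A[i] + B[i]; break;
      case ToyOp::Sub: Res[i] = A[i] - B[i]; break;
      case ToyOp::Mul: Res[i] = A[i] * B[i]; break;
    }
  }
  // Res is a separate vector, so assigning it over Attr1 is safe.
  T[ResAttr.empty() ? Attr1 : ResAttr] = Res;
}
```

Calling `ToyColGenericOp(T, "x", "y", "sum", ToyOp::Add)` leaves `x` untouched and creates `sum`, while passing an empty ResAttr replaces `x` in place.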
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
TBool | AddToFirstTable | ||
) |
Performs columnwise arithmetic operation with column of given table.
Definition at line 4844 of file table.cpp.
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
const TBool | floatCast | ||
) |
Performs arithmetic op of column values and given Num.
Definition at line 4975 of file table.cpp.
void TTable::ColGenericOpMP | ( | TInt | ArgColIdx1, |
TInt | ArgColIdx2, | ||
TAttrType | ArgType1, | ||
TAttrType | ArgType2, | ||
TInt | ResColIdx, | ||
TArithOp | op | ||
) |
Definition at line 4708 of file table.cpp.
void TTable::ColGenericOpMP | ( | const TInt & | ColIdx1, |
const TInt & | ColIdx2, | ||
TAttrType | ArgType, | ||
const TFlt & | Num, | ||
TArithOp | op, | ||
TBool | ShouldCast | ||
) |
Definition at line 5032 of file table.cpp.
Performs max of two columns. See TTable::ColGenericOp.
Definition at line 4840 of file table.cpp.
Performs min of two columns. See TTable::ColGenericOp.
Definition at line 4836 of file table.cpp.
Performs columnwise modulus. See TTable::ColGenericOp.
Definition at line 4832 of file table.cpp.
void TTable::ColMod | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise modulus with column of given table.
Definition at line 4969 of file table.cpp.
void TTable::ColMod | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs modulus of column values and given Num.
Definition at line 5079 of file table.cpp.
Performs columnwise multiplication. See TTable::ColGenericOp.
Definition at line 4824 of file table.cpp.
void TTable::ColMul | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise multiplication with column of given table.
Definition at line 4959 of file table.cpp.
void TTable::ColMul | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs multiplication of column values and given Num.
Definition at line 5071 of file table.cpp.
Performs columnwise subtraction. See TTable::ColGenericOp.
Definition at line 4820 of file table.cpp.
void TTable::ColSub | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise subtraction with column of given table.
Definition at line 4954 of file table.cpp.
void TTable::ColSub | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs subtraction of column values and given Num.
Definition at line 5067 of file table.cpp.
|
staticprotected |
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3064 of file table.cpp.
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3088 of file table.cpp.
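As an illustration of strcmp-style comparison over several sort columns, the following standalone sketch compares two rows column by column and lets the first difference decide. `ToyRow` and `ToyCompareRows` are hypothetical names, rows hold integers only, and flipping the sign when `Asc` is false is an assumption about how a descending order could be expressed, not a confirmed detail of the library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical row type: one integer value per column.
using ToyRow = std::vector<int>;

// Returns > 0 if R1 is bigger, < 0 if R2 is bigger, and 0 if they are equal.
// Columns are compared in the order given by Indices; the first difference
// decides. With Asc == false the sign is flipped (assumption of this sketch).
int ToyCompareRows(const ToyRow& R1, const ToyRow& R2,
                   const std::vector<int>& Indices, bool Asc = true) {
  for (int Idx : Indices) {
    assert(Idx >= 0 && Idx < (int)R1.size() && Idx < (int)R2.size());
    if (R1[Idx] != R2[Idx]) {
      const int Cmp = (R1[Idx] > R2[Idx]) ? 1 : -1;
      return Asc ? Cmp : -Cmp;
    }
  }
  return 0;  // equal on every compared column
}
```

A comparator of this shape can drive both the insertion sort and the quicksort helpers, since each only needs the sign of the result.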
|
inlineprotected |
Appends all rows of T to this table, and recalculates indices.
Definition at line 683 of file table.h.
Counts number of unique elements.
Counts the number of appearances of the different elements of the given column. Records results in column CountCol.
Definition at line 1802 of file table.cpp.
void TTable::Defrag | ( | ) |
Releases memory of deleted rows, and defrags.
Also updates metadata, as row indices have changed. Needs some liveness analysis of columns.
Definition at line 3311 of file table.cpp.
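One way to picture defragmentation is as compaction of physical storage followed by remapping of the successor vector. The sketch below is a simplified model, not the library's code: `ToyTable` and `ToyDefrag` are hypothetical names, there is a single data column, and the sentinels `kLast` and `kInvalid` mirror the documented values of Last (-1) and Invalid (-2).

```cpp
#include <cassert>
#include <vector>

const int kLast = -1;     // mirrors the documented Last sentinel
const int kInvalid = -2;  // mirrors the documented Invalid sentinel

// Hypothetical single-column table with a successor vector.
struct ToyTable {
  std::vector<int> Col;   // data column
  std::vector<int> Next;  // Next[i] is the successor of row i
  int FirstValidRow;      // physical index of first valid row, or kLast
};

// Drops logically removed rows and remaps Next and FirstValidRow to the new
// physical indices. The relative physical order of surviving rows is kept.
void ToyDefrag(ToyTable& T) {
  assert(T.Col.size() == T.Next.size());
  const int N = (int)T.Next.size();
  std::vector<int> NewIdx(N, kInvalid);  // old physical index -> new index
  int Kept = 0;
  for (int i = 0; i < N; i++) {
    if (T.Next[i] != kInvalid) { NewIdx[i] = Kept++; }
  }
  std::vector<int> NewCol(Kept), NewNext(Kept);
  for (int i = 0; i < N; i++) {
    if (T.Next[i] == kInvalid) { continue; }
    NewCol[NewIdx[i]] = T.Col[i];
    NewNext[NewIdx[i]] = (T.Next[i] == kLast) ? kLast : NewIdx[T.Next[i]];
  }
  T.FirstValidRow = (Kept == 0) ? kLast : NewIdx[T.FirstValidRow];
  T.Col = NewCol;
  T.Next = NewNext;
}
```

Because every surviving row gets a new physical index, anything keyed by physical row id (indexes, id maps) would have to be rebuilt afterwards, which is the metadata update the description refers to.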
|
inlineprotected |
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 661 of file table.h.
|
protected |
Removes suffix from column names in the Schema.
Definition at line 4665 of file table.cpp.
void TTable::Dump | ( | FILE * | OutF = stdout | ) | const |
Prints table contents to a text file.
Definition at line 887 of file table.cpp.
|
inline |
Gets iterator to the last valid row of the table.
Definition at line 1243 of file table.h.
|
inline |
Gets iterator with remove to the last valid row.
Definition at line 1247 of file table.h.
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids, partitioned on the value of the column SplitAttr, according to the intervals specified by SplitIntervals. Called by ToVarGraphSequence and ToVarGraphSequenceIterator.
Definition at line 3599 of file table.cpp.
|
protected |
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids partitioned on the value of the column SplitAttr, according to the windows specified by JumpSize and WindowSize. Called by ToGraphSequence and ToGraphSequenceIterator.
Definition at line 3547 of file table.cpp.
Definition at line 337 of file table.cpp.
Gets index of column ColName among columns of the same type in the schema.
Definition at line 1013 of file table.h.
Gets set of row ids of rows common with table T.
Definition at line 4014 of file table.cpp.
Gets type of column ColName.
Definition at line 1227 of file table.h.
Gets column type and index of ColName.
Definition at line 666 of file table.h.
|
inline |
|
inlineprotected |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp.
Definition at line 622 of file table.h.
TSize TTable::GetContextMemUsedKB | ( | ) |
Returns approximate memory used by table context in [KB].
Definition at line 3969 of file table.cpp.
|
inline |
Gets the name of the column to be used as dst nodes in the graph.
Definition at line 1165 of file table.h.
TStrV TTable::GetDstNodeFltAttrV | ( | ) | const |
Gets dst node float attribute name vector.
Definition at line 1049 of file table.cpp.
TStrV TTable::GetDstNodeIntAttrV | ( | ) | const |
Gets dst node int attribute name vector.
Definition at line 1016 of file table.cpp.
TStrV TTable::GetDstNodeStrAttrV | ( | ) | const |
Gets dst node str attribute name vector.
Definition at line 1082 of file table.cpp.
TStrV TTable::GetEdgeFltAttrV | ( | ) | const |
Gets edge float attribute name vector.
Definition at line 1060 of file table.cpp.
TStrV TTable::GetEdgeIntAttrV | ( | ) | const |
Gets edge int attribute name vector.
Definition at line 1027 of file table.cpp.
TStrV TTable::GetEdgeStrAttrV | ( | ) | const |
Gets edge str attribute name vector.
Definition at line 1094 of file table.cpp.
|
static |
Extracts edge TTable from PNEANet.
Definition at line 3741 of file table.cpp.
|
static |
Extracts edge TTable from parallel graph PNGraphMP.
Definition at line 3799 of file table.cpp.
|
protected |
Gets the start index to a chunk of empty rows of size NewRows.
Definition at line 4376 of file table.cpp.
Returns the first graph of the sequence.
Return the first graph of the sequence corresponding to the sets of row ids in RowIdBuckets. This is used by the ToGraph*Iterator functions.
Definition at line 3628 of file table.cpp.
|
static |
Extracts node and edge property TTables from THash.
Definition at line 3852 of file table.cpp.
Gets the rows containing Val in flt column ColName.
Returns the RowIdxs in the float column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5453 of file table.cpp.
Gets the value of float attribute ColName at row RowIdx.
Definition at line 1024 of file table.h.
Gets the float value at column ColIdx and row RowIdx.
Definition at line 1120 of file table.h.
Returns a sequence of graphs.
Return a sequence of graphs, each constructed from the set of row ids corresponding to a particular bucket in RowIdBuckets.
Definition at line 3616 of file table.cpp.
|
inlineprotected |
Gets name of the id column of this table.
Definition at line 636 of file table.h.
Gets the rows containing Val in int column ColName.
Returns the RowIdxs in the integer column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5410 of file table.cpp.
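The lookup strategy described above (use an index when one was requested, otherwise scan the whole column) can be sketched in isolation. `ToyIntColumn`, `BuildIndex` and `RowIdxByVal` are hypothetical names; the sketch uses standard containers instead of SNAP's hash types and covers a single integer column.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical integer column with an optional value -> rows index.
struct ToyIntColumn {
  std::vector<int> Vals;
  bool HasIndex = false;
  std::unordered_map<int, std::vector<int>> Index;

  // Builds the index once; plays the role of requesting an index up front.
  void BuildIndex() {
    Index.clear();
    for (int Row = 0; Row < (int)Vals.size(); Row++) {
      Index[Vals[Row]].push_back(Row);
    }
    // Sanity check: an index never has more distinct keys than rows.
    assert(Index.size() <= Vals.size());
    HasIndex = true;
  }

  // Returns the rows holding Val, or an empty vector if there are none.
  std::vector<int> RowIdxByVal(int Val) const {
    std::vector<int> Rows;
    if (HasIndex) {  // fast path: one hash lookup
      auto It = Index.find(Val);
      if (It != Index.end()) { Rows = It->second; }
      return Rows;
    }
    for (int Row = 0; Row < (int)Vals.size(); Row++) {  // slow path: full scan
      if (Vals[Row] == Val) { Rows.push_back(Row); }
    }
    return Rows;
  }
};
```

The two paths return the same rows; the index only pays off when several queries are made against the same column, which matches the recommendation in the description.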
Gets the value of integer attribute ColName at row RowIdx.
Definition at line 1020 of file table.h.
Gets the integer value at column ColIdx and row RowIdx.
Definition at line 1116 of file table.h.
|
protected |
Gets the id of the last valid row of the table.
TSize TTable::GetMemUsedKB | ( | ) |
Returns approximate memory used by table in [KB].
Definition at line 3940 of file table.cpp.
|
inlinestatic |
Definition at line 527 of file table.h.
|
protected |
Returns the next graph in sequence corresponding to RowIdBuckets.
Returns the next graph in sequence corresponding to RowIdBuckets. This is used to iterate over the graph sequence by constructing one graph at a time. Called by NextGraphIterator().
Definition at line 3634 of file table.cpp.
|
static |
Extracts node TTable from PNEANet.
Definition at line 3689 of file table.cpp.
|
inline |
|
inline |
Gets number of valid, i.e. not deleted, rows in this table.
Definition at line 1234 of file table.h.
Partitions the table into NumPartitions and populates Partitions with the ranges.
Definition at line 1177 of file table.cpp.
|
protected |
Gets pivot element for QSort.
Definition at line 3110 of file table.cpp.
Definition at line 5338 of file table.cpp.
Gets a map of logical to physical row ids.
Definition at line 1237 of file table.h.
Returns pointer to a new table created from given Table, with name set to TableName.
Automatically detects the Schema of an input file (data is assumed to be in TSV format).
Definition at line 455 of file table.cpp.
|
inline |
Gets the schema of this table.
Definition at line 1125 of file table.h.
|
inline |
Gets the name of the column to be used as src nodes in the graph.
Definition at line 1158 of file table.h.
TStrV TTable::GetSrcNodeFltAttrV | ( | ) | const |
Gets src node float attribute name vector.
Definition at line 1038 of file table.cpp.
TStrV TTable::GetSrcNodeIntAttrV | ( | ) | const |
Gets src node int attribute name vector.
Definition at line 1005 of file table.cpp.
TStrV TTable::GetSrcNodeStrAttrV | ( | ) | const |
Gets src node str attribute name vector.
Definition at line 1071 of file table.cpp.
Gets the string with KeyId.
Definition at line 1109 of file table.h.
Gets the integer mapping of the string at column ColIdx at row RowIdx.
Definition at line 1033 of file table.h.
Gets the integer mapping of the string at column ColName at row RowIdx.
Definition at line 1038 of file table.h.
Gets the rows containing int mapping Map in str column ColName.
Returns the RowIdxs in the string column given by ColName which have the string with integer mapping Map, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5431 of file table.cpp.
Gets the value in column with id ColIdx at row RowIdx.
Definition at line 626 of file table.h.
Gets the value of string attribute ColName at row RowIdx.
Definition at line 1028 of file table.h.
Gets the value of the string attribute at column ColIdx at row RowIdx.
Definition at line 1043 of file table.h.
Gets the value of the string attribute at column ColName at row RowIdx.
Definition at line 1048 of file table.h.
void TTable::Group | ( | const TStrV & | GroupBy, |
const TStr & | GroupColName, | ||
TBool | Ordered = true , |
||
TBool | UsePhysicalIds = true |
||
) |
Groups rows depending on values of GroupBy columns.
Specifies the columns to group by, the name of the group column in the new table, and whether to treat the columns as ordered. If the column name is an empty string, no column is created.
Definition at line 1569 of file table.cpp.
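Grouping by several columns amounts to mapping each distinct key tuple to a group id and recording that id per row. The following is a minimal standalone sketch under stated assumptions: `ToyGroup` is a hypothetical name, columns hold integers, keys are treated as ordered tuples, and ids are handed out in order of first appearance, which need not match how the library numbers its groups.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Assigns a group id to every row. Rows with equal values in all GroupBy
// columns share an id. Cols[c][r] is the value of column c at row r.
std::vector<int> ToyGroup(const std::vector<std::vector<int>>& Cols,
                          const std::vector<int>& GroupBy) {
  assert(!Cols.empty());
  const std::size_t NumRows = Cols[0].size();
  std::map<std::vector<int>, int> KeyToGroup;  // key tuple -> group id
  std::vector<int> GroupCol(NumRows);
  for (std::size_t r = 0; r < NumRows; r++) {
    std::vector<int> Key;
    for (int c : GroupBy) { Key.push_back(Cols[c][r]); }
    // insert() keeps the existing id if the key was already seen.
    auto Res = KeyToGroup.insert(std::make_pair(Key, (int)KeyToGroup.size()));
    GroupCol[r] = Res.first->second;
  }
  return GroupCol;
}
```

The returned vector corresponds to the optional group column; skipping that column when its name is empty would simply mean discarding the vector and keeping only the key-to-group map.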
|
protected |
Helper function for grouping.
If KeepUnique is true, UniqueVec will be modified to contain a row from each group. If KeepUnique is false, then normal grouping is done and a new column is added depending on whether GroupColName is empty.
Definition at line 1322 of file table.cpp.
|
protected |
Groups/hashes by a single column with float values. Returns hash table with grouping.
Definition at line 1626 of file table.h.
|
protected |
Groups/hashes by a single column with integer values.
Group/hash by a single column with integer values. Returns hash table with grouping. IndexSet tells what rows to consider (vector of physical row ids). It is used only if All == true. Note that the IndexSet option is currently not used anywhere.
Definition at line 1598 of file table.h.
void TTable::GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const
Groups/hashes by a single column with integer values, using OpenMP multi-threading.
Definition at line 1225 of file table.cpp.
|
protected |
Groups/hashes by a single column with string values. Returns hash table with grouping.
Definition at line 1653 of file table.h.
|
protected |
Checks if grouping key exists and matches given attr type.
Definition at line 1215 of file table.cpp.
|
protected |
Increments the next vector and sets last, NumRows and NumValidRows.
Definition at line 2255 of file table.cpp.
Initializes an empty table for the join of this table with the given table.
Definition at line 1916 of file table.cpp.
void TTable::InitIds ()
Adds explicit row ids and initializes the hash set mapping ids to physical rows.
Definition at line 1883 of file table.cpp.
|
protected |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation.
Definition at line 3535 of file table.cpp.
Returns intersection of this table with given Table.
Definition at line 4567 of file table.cpp.
Definition at line 1422 of file table.h.
|
protected |
|
protected |
Definition at line 646 of file table.h.
TBool TTable::IsLastGraphOfSequence ()
Checks if the end of the graph sequence is reached.
Definition at line 3685 of file table.cpp.
PTable TTable::IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="")
Distance based filter.
Creates a table T' whose rows are joined rows (T[r1], T[r2]) such that r2 is one of the K rows that follow r1 when this table is ordered by OrderCol, and both r1 and r2 have the same value in the GroupBy column.
Definition at line 3891 of file table.cpp.
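To make the pairing rule concrete, here is a small standard-library sketch. It is an interpretation rather than SNAP's code: the `Row` layout and `NextKPairs` are hypothetical, and it assumes that "successive" is counted within each group after sorting by the order column, with K >= 0.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical row layout: an id, the OrderCol value and the GroupBy value.
struct Row {
  int Id;
  int Order;
  int Group;
};

// Returns (r1.Id, r2.Id) pairs where r2 is one of the K rows that follow r1
// once the rows of r1's group are sorted by Order. Assumes K >= 0.
std::vector<std::pair<int, int>> NextKPairs(std::vector<Row> Rows, int K) {
  std::stable_sort(Rows.begin(), Rows.end(),
                   [](const Row& A, const Row& B) { return A.Order < B.Order; });
  std::map<int, std::vector<int>> IdsByGroup;  // group value -> ids in order
  for (const Row& R : Rows) IdsByGroup[R.Group].push_back(R.Id);

  std::vector<std::pair<int, int>> Pairs;
  const std::size_t Span = static_cast<std::size_t>(K);
  for (const auto& Entry : IdsByGroup) {
    const std::vector<int>& Ids = Entry.second;
    for (std::size_t I = 0; I < Ids.size(); ++I) {
      for (std::size_t J = I + 1; J < Ids.size() && J <= I + Span; ++J) {
        Pairs.emplace_back(Ids[I], Ids[J]);
      }
    }
  }
  return Pairs;
}
```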
|
protected |
Performs insertion sort on given vector V.
Definition at line 3096 of file table.cpp.
Definition at line 5321 of file table.cpp.
|
inlineprotected |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row.
Definition at line 801 of file table.h.
Performs equijoin.
Performs an equi-join on the given columns, i.e. keeps tuple pairs where this->Col1 == Table->Col2. Implementation: hash join. A hash is built out of the smaller table; the larger table is then hashed and checked for collisions.
Definition at line 2272 of file table.cpp.
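The hash-join strategy named above can be sketched on two plain integer columns. This is an illustrative standard-library version, not the code in table.cpp; `HashJoin` and its parameters are hypothetical names.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Returns (left row, right row) index pairs with Left[l] == Right[r].
// The hash table is built on the smaller input; the larger one probes it.
std::vector<std::pair<int, int>> HashJoin(const std::vector<int>& Left,
                                          const std::vector<int>& Right) {
  const bool BuildOnLeft = Left.size() <= Right.size();
  const std::vector<int>& Build = BuildOnLeft ? Left : Right;
  const std::vector<int>& Probe = BuildOnLeft ? Right : Left;

  // Build phase: value -> rows of the smaller input holding that value.
  std::unordered_map<int, std::vector<int>> Buckets;
  for (int B = 0; B < static_cast<int>(Build.size()); ++B) {
    Buckets[Build[B]].push_back(B);
  }

  // Probe phase: every hit in a bucket yields one joined pair.
  std::vector<std::pair<int, int>> Pairs;
  for (int P = 0; P < static_cast<int>(Probe.size()); ++P) {
    const auto It = Buckets.find(Probe[P]);
    if (It == Buckets.end()) continue;
    for (int B : It->second) {
      if (BuildOnLeft) Pairs.emplace_back(B, P);
      else Pairs.emplace_back(P, B);
    }
  }
  return Pairs;
}
```

Building on the smaller side keeps the hash table small; the result is the same set of pairs whichever side is chosen.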
Definition at line 1360 of file table.h.
|
protected |
Removes all rows that are not mentioned in the SORTED vector KeepV.
Definition at line 1152 of file table.cpp.
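One plausible reason for requiring a sorted KeepV is that the surviving rows can then be found in a single pass. The sketch below shows that idea with the standard library only; `KeepSorted` is a hypothetical helper and makes no claim about how table.cpp does it.

```cpp
#include <cstddef>
#include <vector>

// Returns the rows of ValidRows that also appear in KeepV.
// Both inputs must be sorted in ascending order.
std::vector<int> KeepSorted(const std::vector<int>& ValidRows,
                            const std::vector<int>& KeepV) {
  std::vector<int> Kept;
  std::size_t K = 0;  // cursor into KeepV; only ever moves forward
  for (int Row : ValidRows) {
    while (K < KeepV.size() && KeepV[K] < Row) ++K;
    if (K < KeepV.size() && KeepV[K] == Row) Kept.push_back(Row);
  }
  return Kept;
}
```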
|
inlinestatic |
Loads table from a binary format.
TTableContext Context must be provided as a parameter and loaded separately from the table, as it can be shared among multiple tables. Context can be loaded either before or after the table load, but must be available for operations that require string values (as opposed to string references).
Definition at line 971 of file table.h.
|
inlinestatic |
Static constructor to load table from memory.
Cannot perform operations that edit the edge vectors of nodes or perform illegal operations on any internal hashes (deletion or swapping keys)
Definition at line 975 of file table.h.
|
static |
Loads table from spreadsheet (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 795 of file table.cpp.
|
static |
Loads table from spreadsheet, but only loads the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 757 of file table.cpp.
|
staticprotected |
Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns.
Definition at line 507 of file table.cpp.
|
staticprotected |
Sequentially loads data from input file at InFNm into NewTable.
Definition at line 669 of file table.cpp.
|
private |
Definition at line 360 of file table.cpp.
|
protected |
Helper function for parallel QSort.
Definition at line 3178 of file table.cpp.
Returns table with rows that are present in this table but not in given Table.
Definition at line 4592 of file table.cpp.
Definition at line 1425 of file table.h.
|
inlinestatic |
|
inlinestatic |
|
inlinestatic |
|
inlinestatic |
PNEANet TTable::NextGraphIterator ()
Calls to this must be preceded by a call to one of the ToGraph*Iterator functions.
Definition at line 3681 of file table.cpp.
Adds suffix to column name if it doesn't exist.
Definition at line 539 of file table.h.
void TTable::Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true)
Orders the rows according to the values in columns of OrderBy (in descending lexicographic order).
Definition at line 3240 of file table.cpp.
|
protected |
Partitions vector for QSort.
Definition at line 3126 of file table.cpp.
Definition at line 5355 of file table.cpp.
void TTable::PrintContextSize ()
Definition at line 3959 of file table.cpp.
void TTable::PrintSize ()
Definition at line 3930 of file table.cpp.
Returns table with only the columns in ProjectCols.
Definition at line 4615 of file table.cpp.
void TTable::ProjectInPlace (const TStrV &ProjectCols)
Keeps only the columns specified in ProjectCols.
Definition at line 5239 of file table.cpp.
|
protected |
Performs QSort on given vector V.
Definition at line 3154 of file table.cpp.
Definition at line 5378 of file table.cpp.
|
protected |
Performs QSort in parallel on given vector V.
Definition at line 3206 of file table.cpp.
Reads values of entire float column into Result.
Definition at line 5221 of file table.cpp.
Reads values of entire int column into Result.
Definition at line 5212 of file table.cpp.
Reads values of entire string column into Result.
Definition at line 5230 of file table.cpp.
|
protected |
Reinitializes row ids.
Register (cache) result of a grouping statement by a single group-by attribute. T is a hash table mapping a key x to rows keyed by x. Disabled for now.
Definition at line 1889 of file table.cpp.
|
protected |
Removes first valid row of the table.
Definition at line 1122 of file table.cpp.
Removes row with id RowIdx.
Definition at line 1135 of file table.cpp.
Renames a column.
Definition at line 1105 of file table.cpp.
Creates Index for Flt Column ColName.
Creates an Index on float column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5495 of file table.cpp.
Creates Index for Int Column ColName.
Creates an Index on integer column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5476 of file table.cpp.
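The index contract described above (hash from value to row indices, fallback to a full scan, no automatic refresh) can be illustrated with a small standard-library model. `IndexedIntColumn` and its members are hypothetical and only mimic the documented behavior; they are not part of SNAP.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical model of one integer column with an optional hash index.
struct IndexedIntColumn {
  std::vector<int> Values;                          // Values[i] = value at row i
  std::unordered_map<int, std::vector<int>> Index;  // value -> row indices
  bool HasIndex = false;

  // (Re)builds the index from the current column contents.
  void RequestIndex() {
    Index.clear();
    for (int Row = 0; Row < static_cast<int>(Values.size()); ++Row) {
      Index[Values[Row]].push_back(Row);
    }
    HasIndex = true;
  }

  // Uses the index if one was requested; otherwise scans every row.
  // The index is not refreshed here, so it can be stale after edits.
  std::vector<int> RowIdxByVal(int Val) const {
    if (HasIndex) {
      const auto It = Index.find(Val);
      return It == Index.end() ? std::vector<int>{} : It->second;
    }
    std::vector<int> Rows;
    for (int Row = 0; Row < static_cast<int>(Values.size()); ++Row) {
      if (Values[Row] == Val) Rows.push_back(Row);
    }
    return Rows;
  }
};
```

In this model, a row appended after RequestIndex() is invisible to lookups until RequestIndex() is called again, which mirrors the responsibility the documentation places on the user.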
Creates Index for Str Column ColName.
Creates an Index on string column given by ColName. The index is hash-based, going from the column value (that is, the integer mapping of the string value) to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5514 of file table.cpp.
|
protected |
Resizes the table to hold RowCount rows.
Definition at line 4330 of file table.cpp.
void TTable::Save (TSOut &SOut)
Saves table schema and content to a binary format.
Note that TTableContext must be saved separately as it can be shared among multiple tables.
Definition at line 854 of file table.cpp.
void TTable::SaveBin (const TStr &OutFNm)
Saves table schema and content to a binary file.
Definition at line 849 of file table.cpp.
void TTable::SaveSS (const TStr &OutFNm)
Saves table schema and content to a TSV file.
Definition at line 800 of file table.cpp.
void TTable::Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true)
Selects rows that satisfy given Predicate.
Select. Has two modes of operation:
Definition at line 2750 of file table.cpp.
|
inline |
Definition at line 1266 of file table.h.
void TTable::SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true)
Selects rows using atomic compare operation.
Optimized cases of select with a predicate of atomic form: compares an attribute to another attribute or to a constant.
Definition at line 2813 of file table.cpp.
Definition at line 1278 of file table.h.
void TTable::SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true)
Selects rows where the value of Col matches given primitive Val.
Definition at line 2873 of file table.cpp.
|
inline |
Definition at line 1290 of file table.h.
|
inline |
Definition at line 1296 of file table.h.
Definition at line 1323 of file table.h.
|
inline |
Definition at line 1326 of file table.h.
Definition at line 1309 of file table.h.
|
inline |
Definition at line 1312 of file table.h.
Definition at line 1316 of file table.h.
|
inline |
Definition at line 1319 of file table.h.
void TTable::SelectFirstNRows (const TInt &N)
Selects first N rows from the table.
Definition at line 3357 of file table.cpp.
Joins table with itself, on values of Col.
Definition at line 1366 of file table.h.
|
inline |
Definition at line 1367 of file table.h.
PTable TTable::SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is less than the specified threshold.
Returns table with schema (GroupId1, GroupId2, Similarity).
Definition at line 2094 of file table.cpp.
PTable TTable::SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is less than the specified threshold.
SimJoinPerGroup performs SimJoin based on a set of attributes. Performs the grouping internally and returns a projection of the columns on which groupby was performed along with the similarity.
Definition at line 2180 of file table.cpp.
|
inline |
Sets the columns to be used as both src and dst node attributes.
Definition at line 1188 of file table.h.
|
inline |
Sets the name of the column to be used as dst nodes in the graph.
Definition at line 1167 of file table.h.
|
inlineprotected |
Sets the first valid row of the TTable.
Definition at line 811 of file table.h.
Definition at line 4152 of file table.cpp.
|
inlinestatic |
Definition at line 526 of file table.h.
|
inline |
Sets the name of the column to be used as src nodes in the graph.
Definition at line 1160 of file table.h.
PTable TTable::SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is within the specified threshold.
Returns a similarity-based join of two tables, based on a given distance metric and a given threshold. Records (r1, r2) that are returned satisfy the criterion d(r1, r2) <= Threshold.
Definition at line 1994 of file table.cpp.
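The criterion d(r1, r2) <= Threshold can be illustrated with a naive nested-loop join. The sketch below assumes Euclidean (L2) distance as the metric and equal-length rows; `Point`, `L2Distance` and `SimJoinL2` are hypothetical names, and the real method selects its metric through SimType.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical row type: the numeric values of the join columns.
using Point = std::vector<double>;

// Euclidean distance; assumes A and B have the same length.
double L2Distance(const Point& A, const Point& B) {
  double Sum = 0.0;
  for (std::size_t I = 0; I < A.size(); ++I) {
    const double D = A[I] - B[I];
    Sum += D * D;
  }
  return std::sqrt(Sum);
}

// Returns (row of T1, row of T2) index pairs whose distance is at most
// Threshold. Quadratic in the table sizes; meant only to show the rule.
std::vector<std::pair<int, int>> SimJoinL2(const std::vector<Point>& T1,
                                           const std::vector<Point>& T2,
                                           double Threshold) {
  std::vector<std::pair<int, int>> Pairs;
  for (int I = 0; I < static_cast<int>(T1.size()); ++I) {
    for (int J = 0; J < static_cast<int>(T2.size()); ++J) {
      if (L2Distance(T1[I], T2[J]) <= Threshold) Pairs.emplace_back(I, J);
    }
  }
  return Pairs;
}
```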
Splices table into subtables according to a grouping statement.
Definition at line 1808 of file table.cpp.
Adds entire flt column to table.
Definition at line 4104 of file table.cpp.
|
protected |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported.
Stores column for a group. Physical row ids have to be passed.
Definition at line 1310 of file table.cpp.
Adds entire int column to table.
Definition at line 4087 of file table.cpp.
Adds entire str column to table.
Definition at line 4121 of file table.cpp.
|
inlinestatic |
|
inlinestatic |
PTable TTable::ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false)
Definition at line 2644 of file table.cpp.
|
protected |
Definition at line 2506 of file table.cpp.
|
protected |
Definition at line 2557 of file table.cpp.
|
protected |
Definition at line 2478 of file table.cpp.
|
protected |
Definition at line 2608 of file table.cpp.
|
protected |
Definition at line 2622 of file table.cpp.
Creates a sequence of graphs based on grouping specified by GroupAttr.
Definition at line 3662 of file table.cpp.
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3676 of file table.cpp.
TVec< PNEANet > TTable::ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize.
Definition at line 3651 of file table.cpp.
PNEANet TTable::ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3666 of file table.cpp.
TVec< PNEANet > TTable::ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals.
Definition at line 3657 of file table.cpp.
PNEANet TTable::ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3671 of file table.cpp.
Returns union of this table with given Table.
Definition at line 4531 of file table.cpp.
Definition at line 1413 of file table.h.
Returns union of this table with given Table, preserving duplicates.
Definition at line 4511 of file table.cpp.
Definition at line 1416 of file table.h.
void TTable::UnionAllInPlace (const TTable &Table)
Same as TTable::ConcatTable.
Definition at line 4524 of file table.cpp.
|
inline |
Definition at line 1419 of file table.h.
void TTable::Unique (const TStr &Col)
Removes rows with duplicate values in given column.
Definition at line 1266 of file table.cpp.
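Removing duplicates by one column amounts to keeping a single representative row per distinct value. The sketch below assumes the first row seen for each value is the one kept, which the documentation does not state; `UniqueRows` is a hypothetical helper built on the standard library.

```cpp
#include <unordered_set>
#include <vector>

// Returns the row indices to keep: the first row seen for each distinct
// value in Col. Assumption: "first seen" is the surviving row.
std::vector<int> UniqueRows(const std::vector<int>& Col) {
  std::unordered_set<int> Seen;
  std::vector<int> Keep;
  for (int Row = 0; Row < static_cast<int>(Col.size()); ++Row) {
    // insert() reports whether the value was new
    if (Seen.insert(Col[Row]).second) Keep.push_back(Row);
  }
  return Keep;
}
```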
Removes rows with duplicate values in given columns.
Definition at line 1298 of file table.cpp.
void TTable::UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
Definition at line 4242 of file table.cpp.
void TTable::UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
Definition at line 4174 of file table.cpp.
|
protected |
|
protected |
|
protected |
Updates table state after adding one or more rows.
Definition at line 4140 of file table.cpp.
|
protected |
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs.
Example: <T_1.age,T_2.age, age> - T_1.age is a src node attribute, T_2.age is a dst node attribute. However, since all nodes refer to the same universe of entities (users) we just do one assignment of age per node, and call that attribute 'age'. This list should be very small.