![]() |
Multilevel Deduplication Engine (MDE)
|
The main MDENode structure. This class can be used as-is with a type or derived for additional functionality as needed. More...
#include <mde.hpp>

Classes | |
| struct | Index |
| Index returned by an operation. Being defined inside the class ensures type safety and possible future extensions. More... | |
| struct | PropertySetHolder |
| class | PropertySetStorage |
Public Types | |
| using | PropertyT = typename Config::PropertyT |
| using | PropertyLess = typename Config::PropertyLess |
| using | PropertyHash = typename Config::PropertyHash |
| using | PropertyEqual = typename Config::PropertyEqual |
| using | PropertyPrinter = typename Config::PropertyPrinter |
| using | Nesting = NestingT |
| using | PropertyElement = typename Nesting::template PropertyElement< PropertyLess, PropertyHash, PropertyEqual, PropertyPrinter > |
| using | PropertySet = std::vector< PropertyElement > |
| using | PropertySetHash = SetHash< PropertySet, PropertyElement, typename PropertyElement::Hash > |
| using | PropertySetFullEqual = SetEqual< PropertySet, PropertyElement, typename PropertyElement::FullEqual > |
| using | PropertySetMap = MapAdapter< std::unordered_map< const PropertySet *, IndexValue, PropertySetHash, PropertySetFullEqual > > |
| using | UnaryOperationMap = OperationMap< IndexValue > |
| using | BinaryOperationMap = OperationMap< OperationNode > |
| using | RefList = typename Nesting::MDEReferenceList |
Public Member Functions | |
| MDENode (RefList reflist={}) | |
| bool | is_empty (const Index &i) const |
| void | clear_and_initialize () |
| Resets state of the MDE to the default. | |
| SubsetRelation | is_subset (const Index &a, const Index &b) const |
| Returns whether we currently know whether a is a subset or a superset of b. | |
| Index | register_set_single (const PropertyElement &c) |
| Inserts a (or gets an existing) single-element set into property set storage. | |
| Index | register_set_single (const PropertyElement &c, bool &cold) |
| Inserts a (or gets an existing) single-element set into the property set storage, and reports whether this set was already present or not. | |
| void | prepare_vector_set (PropertySet &c) |
| template<bool disable_integrity_check = false> | |
| Index | register_set (const PropertySet &c) |
| template<bool disable_integrity_check = false> | |
| Index | register_set (const PropertySet &c, bool &cold) |
| template<bool disable_integrity_check = false> | |
| Index | register_set (PropertySet &&c) |
| template<bool disable_integrity_check = false> | |
| Index | register_set (PropertySet &&c, bool &cold) |
| template<typename Iterator , bool disable_integrity_check = false> | |
| Index | register_set (Iterator begin, Iterator end) |
| template<typename Iterator , bool disable_integrity_check = false> | |
| Index | register_set (Iterator begin, Iterator end, bool &cold) |
| const PropertySet & | get_value (const Index &index) const |
| Gets the actual property set specified by index. | |
| Size | property_set_count () const |
| Returns the total number of property sets currently in the MDE. | |
| Size | size_of (const Index &index) const |
Returns the size of the set at index | |
| OptionalRef< PropertyElement > | find_key (const Index &index, const PropertyT &p) const |
| Finds a property element in the set based on the key provided. | |
| bool | contains (const Index &index, const PropertyElement &prop) const |
Determines whether the property set at index contains the element prop or not. | |
| Index | set_union (const Index &_a, const Index &_b) |
Calculates, or returns a cached result of the union of a and b | |
| Index | set_insert_single (const Index &a, const PropertyElement &b) |
| Inserts a single element from a given set (and returns the index of the set). This is a wrapper over the union operation. | |
| Index | set_difference (const Index &a, const Index &b) |
Calculates, or returns a cached result of the difference of a from b | |
| Index | set_remove_single (const Index &a, const PropertyElement &b) |
| Removes a single element from a given set (and returns the index of the set). This is a wrapper over the diffrerence operation. | |
| Index | set_remove_single_key (const Index &a, const PropertyT &p) |
| Removes a single element from a given set if the "key" element matches. | |
| Index | set_intersection (const Index &_a, const Index &_b) |
Calculates, or returns a cached result of the intersection of a and b | |
| Index | set_filter (Index s, std::function< bool(const PropertyElement &)> filter_func, UnaryOperationMap &cache) |
| Filters a set based on a criterion function. This is supposed to be an abstract filtering mechanism that derived classes will use to implement caching on a filter operation rather than letting them implement their own. | |
| String | property_set_to_string (const Index &idx) const |
| String | dump () const |
| Dumps the current state of the MDE to a string. | |
| String | dump_perf () const |
| Dumps performance information as a string. | |
Static Public Member Functions | |
| static bool | less (const PropertyElement &a, const PropertyElement &b) |
| Less than comparator for operations. You MUST use this instead of directly using anything else like "<". | |
| static bool | less_key (const PropertyElement &a, const PropertyT &b) |
| static bool | less_key (const PropertyElement &a, const PropertyElement &b) |
| static bool | equal (const PropertyElement &a, const PropertyElement &b) |
| Equality comparator for operations. You MUST use this instead of directly using anything else like "<". | |
| static bool | equal_key (const PropertyElement &a, const PropertyT &b) |
| static bool | equal_key (const PropertyElement &a, const PropertyElement &b) |
| static String | property_set_to_string (const PropertySet &set) |
| Converts the property set to a string. | |
Static Public Attributes | |
| static constexpr const char * | name = Config::name |
| static constexpr Size | BLOCK_SIZE = Config::BLOCK_SIZE |
| static constexpr Size | BLOCK_MASK = Config::BLOCK_MASK |
| static constexpr Size | BLOCK_SHIFT = Config::BLOCK_SHIFT |
Protected Member Functions | |
| void | store_subset (const Index &a, const Index &b) |
Stores index a as the subset of index b if a < b, else stores index a as the superset of index b | |
| virtual void | clear () |
| Removes all data from the MDE. | |
Protected Attributes | |
| RefList | reflist |
| PerformanceStatistics | stat |
| HashMap< String, OperationPerf > | perf |
| PropertySetStorage | property_sets = {} |
| PropertySetMap | property_set_map = {} |
| BinaryOperationMap | unions = {} |
| BinaryOperationMap | intersections = {} |
| BinaryOperationMap | differences = {} |
| InternalMap< OperationNode, SubsetRelation > | subsets = {} |
Friends | |
| std::ostream & | operator<< (std::ostream &os, const MDENode &obj) |
The main MDENode structure. This class can be used as-is with a type or derived for additional functionality as needed.
| PropertyT | The type of the property. The property type must satisfy the following:
|
| PropertyLess | Custom less-than comparator (if required) |
| PropertyHash | Custom hasher (if required) |
| PropertyEqual | Custom equality comaparator (if required) |
| PropertyPrinter | PropertyT string representation generator |
| NestingT | Nesting behaviour of the MDE |
| using mde::MDENode< Config, NestingT >::BinaryOperationMap = OperationMap<OperationNode> |
| using mde::MDENode< Config, NestingT >::Nesting = NestingT |
| using mde::MDENode< Config, NestingT >::PropertyElement = typename Nesting::template PropertyElement< PropertyLess, PropertyHash, PropertyEqual, PropertyPrinter> |
| using mde::MDENode< Config, NestingT >::PropertyEqual = typename Config::PropertyEqual |
| using mde::MDENode< Config, NestingT >::PropertyHash = typename Config::PropertyHash |
| using mde::MDENode< Config, NestingT >::PropertyLess = typename Config::PropertyLess |
| using mde::MDENode< Config, NestingT >::PropertyPrinter = typename Config::PropertyPrinter |
| using mde::MDENode< Config, NestingT >::PropertySet = std::vector<PropertyElement> |
| using mde::MDENode< Config, NestingT >::PropertySetFullEqual = SetEqual< PropertySet, PropertyElement, typename PropertyElement::FullEqual> |
| using mde::MDENode< Config, NestingT >::PropertySetHash = SetHash< PropertySet, PropertyElement, typename PropertyElement::Hash> |
| using mde::MDENode< Config, NestingT >::PropertySetMap = MapAdapter<std::unordered_map< const PropertySet *, IndexValue, PropertySetHash, PropertySetFullEqual> > |
The structure responsible for mapping property sets to their respective unique indices. When a key-value pair is actually inserted into the map, the key is a pointer to a valid storage location held by a member of the property set storage vector.
Careful handling, especially in the case of reallocating structures like vectors is needed so that the address of the data does not change. It must remain static for the duration of the existence of the MDE instance.
| using mde::MDENode< Config, NestingT >::PropertyT = typename Config::PropertyT |
| using mde::MDENode< Config, NestingT >::RefList = typename Nesting::MDEReferenceList |
| using mde::MDENode< Config, NestingT >::UnaryOperationMap = OperationMap<IndexValue> |
|
inlineexplicit |
|
inlineprotectedvirtual |
|
inline |
|
inline |
|
inline |
|
inline |
|
inlinestatic |
|
inlinestatic |
|
inlinestatic |
|
inline |
|
inline |
|
inline |
|
inlinestatic |
|
inlinestatic |
|
inlinestatic |
|
inline |
Deduplicates and sorts a vector (to function equivalently to a set).
The deduplicate-sort function here is based on a stackoverflow answer (chosen for speed metrics): https://stackoverflow.com/a/24477023
Ideally, this shouldn't be used or require being used.
|
inline |
|
inlinestatic |
|
inline |
|
inline |
|
inline |
|
inline |
|
inline |
|
inline |
Inserts a (or gets an existing) single-element set into the property set storage, and reports whether this set was already present or not.
| [in] | c | The single-element property set. |
| [out] | cold | Report if this was a cold miss. |
|
inline |
Filters a set based on a criterion function. This is supposed to be an abstract filtering mechanism that derived classes will use to implement caching on a filter operation rather than letting them implement their own.
| [in] | s | The set to filter |
| [in] | filter_func | The filter function (can be a lambda) |
| cache | The cache to use (possibly defined by the user) |
Implement sort bound optimization
Implement bounding as a seperate function instead
|
inline |
|
inline |
|
staticconstexpr |
|
staticconstexpr |
|
staticconstexpr |
|
protected |
|
protected |
|
protected |
|
protected |
|
protected |
|
protected |
|
protected |
|
protected |
|
protected |