1.6.5.8.11.2. Caching / geometric search index as of V9.08

1.6.5.8.11.2.1. Structure of indices

As of version 9.08 there are some changes concerning storage location and files:

Storage location

$CADENAS_DATA\index\cat\...\

The Geo Index and the Topology Index are now stored separately.

The new Geo Index is located in the subfolder geoindexv2 (next to the "old" geoindex), the Topo Index in the folder topoindex.

The following files are located in the geoindexv2 directory:

  1. geomsearch.fdb:

    This file contains the fingerprints for the individual algorithms.

  2. geomsearch.cidb:

    This file contains part information that is relevant only when generating the fingerprints; it does not need to be read for searching.

  3. geomsearch.ldx:

    This file contains the linear index for the individual search templates.

  4. geomsearch.ofm:

    This file contains a mapping of project paths to the internal IDs of the Geo Index.

    In a certain mode, updates are not integrated directly into the Geo Index but saved in special update files in order to speed up the procedure. More information on this is found under Section 1.6.5.8.11.2.3, “Creation of Geo and Topo indices”. The following files are used for this purpose:

    • geomsearch.ufdb: Contains the fingerprints of the changes.

    • geomsearch.ucidb: Contains information about changes that are not relevant for searches.

    • geomsearch.uop: Contains all the information required to integrate the fingerprints of the changes into the Geo Index.

    While the Geo Index is being edited, one more file is found in this directory:

    • geomsearch.lock: Prevents the Geo Index from being accessed while it is being written.

The following files are located in the topoindex directory:

  1. topoindex.bin: This file contains the structured topo data.

  2. topoindex.idx: This file contains indices for searching the data.

    For the Topo Index there is also the option of quick updates. In that case the directory additionally contains the following files:

    • topoindex.dupd: The structure data of the changes.

    • topoindex.upd: Information required to integrate the changes into the index.

    This directory may also be locked:

    • topoindex.lock: Prevents the Topo Index from being accessed while it is being written.
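The file layout above can be checked from a script, for example to see whether an index is currently locked or carries quick-update files that have not yet been integrated. The following is a minimal sketch that uses only the file names listed above; the state labels ("locked", "pending-updates", "clean") and the function names are this example's own and are not part of the product.

```python
from pathlib import Path

# File names as listed above; "pending" is this sketch's own label.
GEO_UPDATE_FILES = ("geomsearch.ufdb", "geomsearch.ucidb", "geomsearch.uop")
TOPO_UPDATE_FILES = ("topoindex.dupd", "topoindex.upd")


def index_state(index_dir, lock_name, update_names):
    """Return 'missing', 'locked', 'pending-updates' or 'clean' for one index directory."""
    directory = Path(index_dir)
    if not directory.is_dir():
        return "missing"
    if (directory / lock_name).exists():
        return "locked"
    if any((directory / name).exists() for name in update_names):
        return "pending-updates"
    return "clean"


def catalog_index_states(catalog_index_dir):
    """Inspect the geoindexv2 and topoindex subfolders of one catalog's index folder."""
    base = Path(catalog_index_dir)
    return {
        "geo": index_state(base / "geoindexv2", "geomsearch.lock", GEO_UPDATE_FILES),
        "topo": index_state(base / "topoindex", "topoindex.lock", TOPO_UPDATE_FILES),
    }
```

The lock file is checked first because, according to the description above, a locked directory must not be accessed regardless of which other files are present.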

1.6.5.8.11.2.2.  geomsearch.cfg -> Block [settings] - Common settings

Some settings can be made under $CADENAS_SETUP/geomsearch.cfg. These are summarized in the [settings] block.

  1. Key: searchindex

    Determines which Geo Index is used for searching. Possible values are:

    • old: Use old index

    • new: Use new index

    • both: Use new index, if not available, use old index

  2. Key: toposearchindex

    Determines which Topo Index is used for searching. Possible values are:

    • old: Use old geo-index

    • new: Use new index

    • both: Use new index, if not available, use old geo-index

  3. Keys: convertindex and converttopoindex

    Conversion of old indices into new indices. Possible values:

    • 0: Conversion deactivated

    • 1: Conversion active. The conversion takes place automatically when an old index is created.

  4. Key: createNewDirectly

    Controls whether the old or the new index is created. Possible values:

    • 0: Create old index

    • 1: Create new index

  5. Key: createNewLinIndex

    Determines whether the Linear Index is recreated during conversion. Possible values:

    • 0: Convert old linear index. The conversion is very fast, but because this index uses fewer pivot elements, the search is somewhat slower.

    • 1: Create new linear index
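To catch typos in these keys before they take effect, the [settings] block can be validated with a few lines of script. The sketch below assumes that geomsearch.cfg can be read as plain INI text by Python's configparser and that key names are case-sensitive; neither is confirmed by this document. The function name check_settings is hypothetical.

```python
import configparser

# Allowed values as documented above for each key in the [settings] block.
ALLOWED = {
    "searchindex": {"old", "new", "both"},
    "toposearchindex": {"old", "new", "both"},
    "convertindex": {"0", "1"},
    "converttopoindex": {"0", "1"},
    "createNewDirectly": {"0", "1"},
    "createNewLinIndex": {"0", "1"},
}


def check_settings(cfg_text):
    """Return a list of problems found in the [settings] block of the given text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case, e.g. createNewDirectly
    parser.read_string(cfg_text)
    if not parser.has_section("settings"):
        return ["block [settings] is missing"]
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in parser["settings"]:
            continue  # keys that are not set are simply skipped
        value = parser["settings"][key].strip()
        if value not in allowed:
            problems.append(f"{key}={value} (expected one of {sorted(allowed)})")
    return problems
```

Calling check_settings with the text of the configuration file returns an empty list when every key that is present has one of the documented values.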

1.6.5.8.11.2.3.  Creation of Geo and Topo indices

There are essentially three options for generating the indices:

  1. New creation of the Geo Index

  2. Update of the existing index (working on a copy). The existing index can also be an old Geo Index, which is then converted automatically.

  3. Update of the existing index (working with special update files). This makes sense if there are only a few changes in a large catalog.

    • Variant 3 is much quicker than variant 2, especially for large catalogs.

    • Variant 3 has the disadvantage that the search becomes slower with each new update. It therefore makes sense to update the index using variant 2 from time to time, because the updates are then completely integrated into the index.

    • The pivot elements of the Linear Index are not recalculated in variant 3. This may have a negative effect on search times.

Further notes

  • The Geo Index and the Topo Index are always created together.

  • In general, the 64-bit variant performs better (as with the migration).

  • During the generation of the indices the directories of the Geo and Topo Index are locked. Variants 1 and 2 work on a temporary copy, so the directory is initially locked only for write access and the index can still be read. The index has to be locked for read access only for a short moment, when the old files are deleted and the temporary files are renamed. Variant 3 does not work on a copy, so access is locked during the whole update.

  • The key ThreadCountForLinearIndexCreator in the block settings_32 or settings_64 specifies how many threads are used to generate the linear index. If the value is set to -1, the number of cores is used. Each thread may require several hundred MB of RAM, which is why high values do not make sense for the 32-bit variant.

    Example:

    [settings_32]
    ThreadCountForLinearIndexCreator=2
    
    [settings_64]
    ThreadCountForLinearIndexCreator=-1
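The effect of the value -1 and the memory consideration can be illustrated with a small calculation. This is only a sketch: the function names are hypothetical, the per-thread memory figure must be supplied by the caller (the text above only says "several hundred MB"), and rejecting values other than -1 or a positive number is an assumption of this example.

```python
import os


def resolve_thread_count(configured, cpu_count=None):
    """Map the configured value to a concrete thread count (-1 means: use all cores)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    if configured == -1:
        return cpu_count
    if configured < 1:
        # Assumption of this sketch: other values are treated as invalid.
        raise ValueError("thread count must be -1 or a positive integer")
    return configured


def estimated_ram_mb(threads, mb_per_thread):
    """Rough upper estimate of the RAM needed by the linear index threads."""
    return threads * mb_per_thread
```

For example, with an assumed 300 MB per thread, 8 threads would already need about 2400 MB, which is beyond what a 32-bit process can typically address, whereas 2 threads stay at about 600 MB.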

1.6.5.8.11.2.4. Changes at the actual search

There are three different search modes for the Geo and Topo Index, depending on what has been defined in the [settings] block of geomsearch.cfg (see above).

  • new: Only the new index is used. If it is not available, there are no results for this catalog.

  • both: If possible, the new index is used. Unavailable indices are converted in memory if necessary.

  • old: Only the old indices are used.
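The three modes amount to a simple decision per catalog, which can be written down as follows. This is a sketch of the behaviour as described above, not the product's implementation; the function name pick_index and the return labels are hypothetical, and returning no index in mode old when the old index is missing is an assumption.

```python
def pick_index(mode, new_available, old_available):
    """Return which index a search would use: 'new', 'old', 'old-converted' or None."""
    if mode == "new":
        return "new" if new_available else None  # None: no results for this catalog
    if mode == "old":
        return "old" if old_available else None  # assumption for a missing old index
    if mode == "both":
        if new_available:
            return "new"
        # Per the text, an unavailable new index is converted in memory from the old one.
        return "old-converted" if old_available else None
    raise ValueError(f"unknown search mode: {mode!r}")
```

The in-memory conversion in mode both has to be repeated by the searching process, so converting the indices on disk beforehand is likely to be preferable for catalogs that are searched often.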

Furthermore the Linear Index is also used for the sketch search as of version 9.08 in order to reduce search times.

1.6.5.8.11.2.5.  Caching settings

All caching settings can be set differently for the 32-bit and 64-bit variants of PARTdataManager.

Geo search
  • LinIndexCacheSize: Size of the cache for the linear index in KB

    Choose a value large enough that the cache is never completely used up.

    If this is not possible, set a smaller value for the superordinate cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    LinIndexCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    LinIndexCacheSize=300000

  • OffsetCacheSize: Size of the cache for the offset index in KB

    Choose a value large enough that the cache is never completely used up.

    If this is not possible, set a smaller value for the superordinate cache (SampleLineCacheSize).

    [CACHEV2_GEO_SEARCH_32]
    OffsetCacheSize=50000

    [CACHEV2_GEO_SEARCH_64]
    OffsetCacheSize=150000

  • GeoIndexV2CacheSize: Number of geo-indexes that can be open at the same time.

    Set the value to match the maximum number of catalogs.

    [CACHEV2_GEO_SEARCH_32]
    GeoIndexV2CacheSize=1000

    [CACHEV2_GEO_SEARCH_64]
    GeoIndexV2CacheSize=1000

  • SampleLineCacheSize: Cache for fingerprints in KB

    The cache is shared by all threads (the number of threads normally corresponds to the total number of processor cores).

    [CACHEV2_GEO_SEARCH_32]
    SampleLineCacheSize=100000

    [CACHEV2_GEO_SEARCH_64]
    SampleLineCacheSize=500000

  • LogFileName: Log information is saved here if not empty. See below.

    Add the key yourself if it does not exist.

    [CACHEV2_GEO_SEARCH_32]
    LogFileName=c:\log\cachev2_geo_search_32.log

    [CACHEV2_GEO_SEARCH_64]
    LogFileName=c:\log\cachev2_geo_search_64.log

Important

When setting values for the Geo search, please observe the following rules:

  1. Set LinIndexCacheSize and OffsetCacheSize large enough that they are never completely used up. If this is not possible, set a smaller value for the superordinate cache SampleLineCacheSize.

  2. If there is still memory left, set SampleLineCacheSize as large as possible.

Including / excluding catalogs

These settings can be used in server environments to exclude catalogs or to load only specific catalogs.

#:VALS_S
#:HELP;default;Include catalog, if the expression matches.
PreloaderIncludeRegexPos=
#:HELP;default;Include catalog, if the expression doesn't match.
PreloaderIncludeRegexNeg=
#:VALS_S
#:HELP;default;Exclude catalog, if the expression matches.
PreloaderExcludeRegexPos=.*copyright\.prj$|.*_qa$|.*_dev$
#:VALS_S
#:HELP;default;Exclude catalog, if the expression doesn't match.
PreloaderExcludeRegexNeg=

See also $CADENAS_USER/varsearch.cfg -> [VariableSearch:Path]:

The setting in geomsearch.cfg is used to cache the geometrical index; the one in $CADENAS_USER/varsearch.cfg -> [VariableSearch:Path] is used to cache the index for variable and full-text search.
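Before changing the Preloader expressions on a server, it can help to test them against a list of catalog names. The sketch below rests on several assumptions that this document does not confirm: an empty expression disables the rule, the expression must match the whole string, exclusion takes precedence over inclusion, and the regular expression dialect is close enough to Python's re module. The function name is_preloaded is hypothetical.

```python
import re


def is_preloaded(catalog, include_pos="", include_neg="", exclude_pos="", exclude_neg=""):
    """Decide whether a catalog would be preloaded under this sketch's assumed rule order."""
    def matches(pattern):
        return re.fullmatch(pattern, catalog) is not None

    # Exclusion rules are checked first (assumption: exclusion wins).
    if exclude_pos and matches(exclude_pos):
        return False
    if exclude_neg and not matches(exclude_neg):
        return False
    # If no include rule is set, everything not excluded is loaded (assumption).
    if not include_pos and not include_neg:
        return True
    if include_pos and matches(include_pos):
        return True
    if include_neg and not matches(include_neg):
        return True
    return False
```

With the default value of PreloaderExcludeRegexPos shown above, a name ending in _qa or _dev would be excluded under these assumptions, while a name such as cat/norm/din would still be loaded.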

Topo search
  • ObjectDataCacheSize: Cache for topo data nodes

    Especially important for the migration.

    [CACHE_TOPO_SEARCH_32]
    ObjectDataCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    ObjectDataCacheSize=1000000

  • IndexCacheSize: Cache for indices on the topo data

    Especially important for the Topo search.

    [CACHE_TOPO_SEARCH_32]
    IndexCacheSize=200000

    [CACHE_TOPO_SEARCH_64]
    IndexCacheSize=500000

  • LogFileName: Log information is saved here if not empty. See also Section 1.6.5.8.11.2.6, “Log file evaluation - Find best settings”.

    Add the key yourself if it does not exist.

    [CACHE_TOPO_SEARCH_32]
    LogFileName=c:\log\cachev2_topo_search_32.log

    [CACHE_TOPO_SEARCH_64]
    LogFileName=c:\log\cachev2_topo_search_64.log

1.6.5.8.11.2.6.  Log file evaluation - Find best settings

The above-mentioned log files (key LogFileName) can be used to optimize the search settings for the available data and search behaviour. The log file shows how full the respective caches are and how often the element was actually in the cache when it was accessed (cache hit).

Example:

Search from Do 12. Dez 22:41:09 2013
Thread: 0xce8

Geometrical index cache: 
Capacity of the cache: 1000
In use: 18 (1.80%)
Free: 982 (98.20%)
Accesses to the cache: 6489
Cache hits: 6471 (99.72%)
Cache misses: 18 (0.28%)

Linear index cache: 
Capacity of the cache: 100000
In use: 4962 (4.96%)
Free: 95038 (95.04%)
Accesses to the cache: 1932
Cache hits: 1615 (83.59%)
Cache misses: 317 (16.41%)

Offset index cache: 
Capacity of the cache: 50000
In use: 1617 (3.23%)
Free: 48383 (96.77%)
Accesses to the cache: 7033
Cache hits: 6716 (95.49%)
Cache misses: 317 (4.51%)

Sample lines cache: 
Capacity of the cache: 100000
In use: 82785 (82.78%)
Free: 17215 (17.22%)
Accesses to the cache: 16840
Cache hits: 10985 (65.23%)
Cache misses: 5855 (34.77%)

Some notes on interpreting the data:

  1. The log file may not be updated until a further search is performed.

  2. To derive effective settings from the log file, it is important to perform several searches that reflect normal user behavior, for example with regard to the selection of search templates, sketch search or 3D search, and the selection of search parts.

  3. The cache hit and cache miss figures written out with each search are cumulative.

  4. Data is not loaded into the cache until it is first accessed, so the first access to any data is always a cache miss. The more searches have been performed, the fewer cache misses should occur.
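Reading the fill level and hit rate of each cache out of the log can be automated. The following sketch parses a log in the format of the example above and reports, per cache, the fill ratio and the hit ratio. The function names and the threshold of 95% for "nearly full" are assumptions of this example, not values recommended by the product.

```python
import re

SECTION = re.compile(r"^(?P<name>.+ cache):\s*$")
FIELDS = (
    ("capacity", re.compile(r"^Capacity of the cache:\s*(\d+)")),
    ("in_use", re.compile(r"^In use:\s*(\d+)")),
    ("accesses", re.compile(r"^Accesses to the cache:\s*(\d+)")),
    ("hits", re.compile(r"^Cache hits:\s*(\d+)")),
)


def parse_cache_log(text):
    """Return {cache name: {capacity, in_use, accesses, hits}}; later entries overwrite earlier ones."""
    stats = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        section = SECTION.match(line)
        if section:
            current = stats[section.group("name")] = {}
            continue
        if current is None:
            continue  # header lines before the first cache section
        for key, pattern in FIELDS:
            found = pattern.match(line)
            if found:
                current[key] = int(found.group(1))
    return stats


def review(stats, full_threshold=0.95):
    """Return (name, fill ratio, hit ratio, note) for each cache."""
    report = []
    for name, s in stats.items():
        fill = s["in_use"] / s["capacity"]
        hit = s["hits"] / s["accesses"] if s["accesses"] else 0.0
        note = "nearly full, consider a larger value" if fill >= full_threshold else "ok"
        report.append((name, fill, hit, note))
    return report
```

Because the hit and miss counters are cumulative, the figures of the most recent search in the log are the ones to look at, which is why later entries for the same cache replace earlier ones here.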

1.6.5.8.11.2.7. Access to topology values via VBS

You can access the Topo values in the following way:

' main class to manage topology
set topoManager = CreateObject("cnstools.topomanager")

' fetch root node of topology tree
set catalogRoot = topoManager.findCatalogRoot("cat/norm/din")
stdprint("Number of projects in din: " & catalogRoot.childCount)

' fetch project node
set prjNode = topoManager.findProjectNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj")
stdprint("Number of lines in anlagenbau/armaturen/din_11864_1_a_asmtab.prj: " & prjNode.childCount)

' fetch line node
set lineNode = topoManager.findLineNode("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 20)

' helper to recursively print all the attributes of a node
sub printAttributes(node, indent)
	stdprint(indent & node.name)
	c = node.childCount
	for j = 0 to c - 1
		' Call is required to pass several arguments in parentheses to a Sub
		Call printAttributes(node.child(j), indent & "  ")
	next
	a = node.attributeCount
	for j = 0 to a - 1
		set attr = node.attribute(j)
		value = attr.value
		valueAsString = ""
		set attrType = attr.type
		if attrType = "doubleVec" then
			n = value.count
			for k = 0 to n - 1
				if k > 0 then
					valueAsString = valueAsString & ", "
				end if
				valueAsString = valueAsString & value.item(k)
			next
		else
			valueAsString = value
		end if
		stdprint(indent & "  " & attr.name & ": " & valueAsString)
	next
end sub

' print all attributes for a line
stdprint("Recursive list of all attributes in line 20:")
stdprint()
Call printAttributes(lineNode, "")
stdprint()

' print all attributes for a stl file
stdprint("Recursive list of all attributes in stl:")
stdprint()
set stlNode = topoManager.createNodeFrom3DFile("D:/stl/1 stl/001952002.stl", "STLFILE")
Call printAttributes(stlNode, "")

' print all attributes for a prt file
stdprint()
stdprint("Recursive list of all attributes in prt file:")
stdprint()
set prtNode = topoManager.createNodeFrom3DFile("D:/stl/ein paar proe-Dateien/1202t4100_gen.prt.1", "NATWILDFIREPART 5 32 BIT")
Call printAttributes(prtNode, "")

' print all attributes for a line (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in line 420:")
stdprint()
set lineNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", 420)
Call printAttributes(lineNode2, "")

' print all attributes for a project (create fingerprints on the fly)
stdprint()
stdprint("Recursive list of all attributes in project anlagenbau/armaturen/din_11864_1_a_asmtab.prj:")
stdprint()
set prjNode2 = topoManager.createNodeFromProject("cat/norm/din", "anlagenbau/armaturen/din_11864_1_a_asmtab.prj", -1)
Call printAttributes(prjNode2, "")