1.6.5.8.11.1. Caching for versions up to 9.07

1.6.5.8.11.1.1. Description of different caches

Excerpt from $CADENAS_SETUP -> geomsearch.cfg

[Cache_Local_Search]
maxOpenIndexCount=100
linIndexCacheSize=0
sampleLineListCacheSize=0
pivotDistListCacheSize=0
logFileName=

[Cache_Server_Search]
maxOpenIndexCount=100
linIndexCacheSize=150000
sampleLineListCacheSize=800000
pivotDistListCacheSize=50000
logFileName=

[Note]Note

The cache can be configured separately for the local search and for the search via the search server.

Caching also works with PARTdataManager, but cannot be used effectively there because the cache is lost each time PARTdataManager is restarted. Therefore, the values linIndexCacheSize, sampleLineListCacheSize and pivotDistListCacheSize are set to '0' for the local search.

Below you will find a description of the individual caches:

  1. GeoIndexCache: Prevents the index from being reopened for every new search.[34] The set value corresponds to the maximum number of open indexes.

    Set the value to the number of catalogs to be searched (in this example, '100').

    maxOpenIndexCount=100

  2. sampleLineListCache:

    Cache for fingerprints

    (The cache is shared by all threads; the number of threads usually corresponds to the number of processor cores. Compare $CADENAS_SETUP/partsol.cfg -> Block "SEARCHSERVER" -> key "THREADS".)

    Example: 80% of 1 GB available RAM (value in KB)

    sampleLineListCacheSize=800000

    [Recommendation: 80% for the initial setting]

  3. linIndexCache: (not used for sketch search)

    Cache for linear index

    Example: 15% of 1 GB available RAM (value in KB)

    linIndexCacheSize=150000

    [Recommendation: 15% for the initial setting]

  4. pivotDistListCache: (not used for sketch search)

    Cache for linear index[35]

    (The cache is shared by all threads; the number of threads usually corresponds to the number of processor cores. Compare $CADENAS_SETUP/partsol.cfg -> Block "SEARCHSERVER" -> key "THREADS".)

    Example: 5% of 1 GB available RAM (value in KB)

    pivotDistListCacheSize=50000

    [Recommendation: 5% for the initial setting]
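The recommended starting shares can be turned into concrete key values with a small calculation. The following Python sketch is only an illustration: the function name initial_cache_sizes and the SHARES table are hypothetical helpers, not part of the product, and 1 GB is treated as 1,000,000 KB so that the results match the example values above.

```python
# Split a RAM budget across the three caches using the recommended
# starting shares (80% / 15% / 5%). All sizes are in KB.
# Assumption: 1 GB is counted as 1,000,000 KB, as in the examples above.

SHARES = {
    "sampleLineListCacheSize": 80,
    "linIndexCacheSize": 15,
    "pivotDistListCacheSize": 5,
}


def initial_cache_sizes(budget_kb):
    """Return the starting value in KB for each cache key (hypothetical helper)."""
    return {key: budget_kb * percent // 100 for key, percent in SHARES.items()}


sizes = initial_cache_sizes(1_000_000)  # budget of 1 GB
for key, value in sizes.items():
    # Prints one "key=value" line per cache, ready to copy into geomsearch.cfg
    print(f"{key}={value}")
```

For a different budget, pass the amount of RAM (in KB) that can be reserved for caching without restricting other processes.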

1.6.5.8.11.1.2. Log file evaluation - Find best settings

You can optimize the settings in two steps:

  1. In the first step, choose the settings based on general experience:

    1. Determine the percentage of working memory you can allocate to the cache without restricting other processes.

    2. Divide this memory among the caches as follows:

      • sampleLineListCacheSize: 80%

      • linIndexCacheSize: 15%

      • pivotDistListCacheSize: 5%

    3. Enter the resulting values in KB in the keys named above.

    4. Set the value of the key maxOpenIndexCount to the number of catalogs to be searched.

    [Note]Note

    If you set all values to '0', the caching is deactivated.

  2. Optimization of values according to the evaluation of the log file

    In the configuration file geomsearch.cfg, you specify where the log file should be saved (see the key logFileName in the excerpt above).

    [Note]Note

    After each search, a report is written indicating how many cache hits there were and how much memory was used. The values are not reset after a search: the statistics accumulate across all searches!

    When PARTdataManager or the search server is restarted, the log file is deleted.

    To arrive at optimal settings for the geometric search, please note the following before evaluating the log file:

    • Conduct searches that are representative of normal user behavior, for example in the choice of search templates, sketch search, or 3D search, and also in the choice of search parts.

    • Ideally, let searches run for several days (search server), or over a long period of PARTdataManager use, before evaluating the log file.
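To make the two steps concrete, the following Python sketch renders a cache block in the same key=value layout as the excerpt from geomsearch.cfg. It is an illustration only: the function render_cache_section and the log file name example_cache.log are hypothetical, and the expected form of the logFileName value (for example, whether an absolute path is required) is an assumption not confirmed by this section.

```python
import configparser
import io


def render_cache_section(section, max_open, sample_kb, lin_kb, pivot_kb, log_file):
    """Return one cache block as text (hypothetical helper, not a product API)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the mixed-case key names unchanged
    parser[section] = {
        "maxOpenIndexCount": str(max_open),
        "linIndexCacheSize": str(lin_kb),
        "sampleLineListCacheSize": str(sample_kb),
        "pivotDistListCacheSize": str(pivot_kb),
        "logFileName": log_file,  # assumed to take a file name or path
    }
    buffer = io.StringIO()
    # Write "key=value" without spaces, matching the excerpt's layout
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(render_cache_section(
    "Cache_Server_Search", 100, 800000, 150000, 50000, "example_cache.log"
))
```

The sketch only produces text for review; copying the values into the real configuration file remains a manual step.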

Example of evaluating the log file

GeoIndexCache CacheHits 999 of 1000, 99%
GeoIndexCache Files 99 of 100, 99%

SampleLineListCache CacheHits 999 of 1000, 99%
SampleLineListCache Memory 400000 of 800000, 50%

LinIndexCache CacheHits 10617 of 10776, 98%
LinIndexCache Memory 90000 of 100000, 90%

PivotDistCache CacheHits 100 of 10000, 1%
PivotDistCache Memory 9999 of 10000, 99%

You do not need to change anything in the GeoIndexCache. This is set to the number of catalogs to be searched.

The following rules apply to the other three caches. (The SampleLineListCache is explained here as an example; the statements also apply to LinIndexCache and PivotDistCache.)

CacheHits

The first row shows the number of cache hits, which is the measure of quality:

SampleLineListCache CacheHits 900 of 1000, 90%

  1. The first value (here 900) shows the number of accesses to elements that were already in the cache.

    The second value (here 1000) shows the total number of accesses.

  2. The third value shows the ratio of the first two values in percent. Here: 90%.

Memory

The second row shows the use of the working memory:

SampleLineListCache Memory 600000 of 800000, 75%

  1. The first two values indicate how many KB were used out of the KB value set in the configuration file.

    In this example 600000 KB of 800000 KB were used.

  2. The third value shows the ratio of the first two values in percent. Here: 75%.
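The row layout described above is regular enough to be read by a script. The following Python sketch assumes that each log row looks exactly like the example rows in this section (cache name, row type, "X of Y", percentage); real log files may contain additional fields, and the names LINE_PATTERN and parse_log_line are hypothetical.

```python
import re

# Pattern for rows such as "SampleLineListCache Memory 600000 of 800000, 75%".
# Assumption: the layout matches the example rows shown in this section.
LINE_PATTERN = re.compile(
    r"^(?P<cache>\w+)\s+(?P<metric>CacheHits|Memory|Files)\s+"
    r"(?P<first>\d+)\s+of\s+(?P<second>\d+),\s*(?P<percent>\d+)%$"
)


def parse_log_line(line):
    """Split one log row into its parts; return None if the row does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "cache": match["cache"],
        "metric": match["metric"],
        "first": int(match["first"]),      # hits, or KB/files in use
        "second": int(match["second"]),    # total accesses, or configured limit
        "percent": int(match["percent"]),  # ratio as written in the log
    }


for row in (
    "SampleLineListCache CacheHits 900 of 1000, 90%",
    "SampleLineListCache Memory 600000 of 800000, 75%",
):
    print(parse_log_line(row))
```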

CacheHits is the measure for the quality of cache usage. If this value is high, the settings are OK.

Memory indicates whether the value set in the configuration file is correctly dimensioned. If you have 100% for CacheHits and 10% for Memory, you would get an equally good hit rate with a significantly smaller allocation.

If the CacheHits are low (e.g. 10%) and the configured cache is fully used (e.g. 100%), try to raise the hit rate by increasing the cache value.
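These rules of thumb can be written down as a simple decision function. The following Python sketch is an illustration only: the function suggest_adjustment is hypothetical, and the threshold values are assumptions chosen for the example, since the text gives only the extreme cases (10% and 100%) and no exact limits.

```python
def suggest_adjustment(hit_percent, memory_percent,
                       low_hits=50, high_hits=90,
                       low_memory=50, full_memory=95):
    """Suggest how to change a cache size from its two log percentages.

    All four thresholds are assumed example values, not documented limits.
    """
    # Few hits although the cache is full: the cache is probably too small.
    if hit_percent < low_hits and memory_percent >= full_memory:
        return "increase"
    # Many hits with little memory in use: the cache is probably oversized.
    if hit_percent >= high_hits and memory_percent < low_memory:
        return "decrease"
    # Otherwise there is no clear signal; leave the setting as it is.
    return "keep"


print(suggest_adjustment(10, 100))  # low hits, cache fully used
print(suggest_adjustment(100, 10))  # perfect hits, cache mostly unused
print(suggest_adjustment(98, 90))   # good hits, cache well used
```

Applied to the example log above, PivotDistCache (1% hits, 99% memory) would be a candidate for a larger value, while LinIndexCache (98% hits, 90% memory) would be left unchanged.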




[34] The index only has to be opened once per thread (the number of threads usually corresponds to the number of processor cores). Each thread has its own cache.

[35] The linear index sorts parts according to distances to certain reference parts. These are called pivots.