What is Data Hoarding and How it Works

Data hoarding

Hoarding is the practice of preloading data into the cache before a disconnection so that the client can continue operating while disconnected. Hoarding is comparable to prefetching, which is used to improve performance in file and database systems.

There are, however, significant distinctions between hoarding and prefetching. Prefetching is a continuous process that loads soon-to-be-used files into the cache during periods of low network traffic. Because prefetching runs continuously, unlike hoarding, it is critical to keep its overhead minimal. Hoarding, on the other hand, is more critical than prefetching, because a cache miss cannot be serviced during disconnection.


As a result, data hoarding tends to overestimate the client's data requirements. On the other hand, because the mobile client's cache is a limited resource, overly generous estimates cannot be accommodated. A critical parameter is the unit of hoarding, which can range from a disk block to a file, or even groups of files or directories. Another consideration is when to begin hoarding. The Coda file system [KS92] periodically performs a process called a hoard walk to ensure that critical files are cached for the mobile user.

The decision on which files to cache can be either

(a) assisted by instructions explicitly given by the user or

(b) taken automatically by the system by utilizing implicit information, which is most often based on the past history of file references.

Coda [KS92] combines both approaches in deciding which data to hoard. Data are prefetched using priorities based on a combination of recent reference history and user-defined hoard files.
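The idea of combining explicit and implicit information can be sketched in a few lines of Python. This is only an illustration and not Coda's actual algorithm: the `Candidate` fields, the `select_hoard` function, the weighting factor `alpha`, and the `horizon` used to age references are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    size: int            # space the file would occupy in the cache
    user_priority: int   # 0 if the user did not list the file, up to 1000
    last_ref_tick: int   # logical time of the most recent reference


def select_hoard(candidates, capacity, now, alpha=0.75, horizon=100):
    """Pick files to hoard, highest combined priority first, within capacity.

    alpha weights the user-given priority against recency (assumed values).
    """
    def priority(c):
        age = now - c.last_ref_tick
        recency = max(0.0, 1.0 - age / horizon)       # 1.0 = just used, 0.0 = stale
        explicit = min(c.user_priority, 1000) / 1000  # normalise to 0..1
        return alpha * explicit + (1 - alpha) * recency

    chosen, used = [], 0
    for c in sorted(candidates, key=priority, reverse=True):
        if used + c.size <= capacity:   # skip files that no longer fit
            chosen.append(c.path)
            used += c.size
    return chosen


candidates = [
    Candidate("thesis.tex", 40, 900, 10),
    Candidate("notes.txt", 30, 0, 99),
    Candidate("video.mp4", 80, 100, 50),
    Candidate("todo.md", 20, 0, 20),
]
hoard = select_hoard(candidates, capacity=100, now=100)
```

Here the user-listed file wins even though it was referenced long ago, the recently used file comes next, and the large file is skipped because it would exceed the limited cache.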

[TLA+95] proposes a tree-based approach in which an execution tree is constructed by processing the history of file references. The tree's nodes represent the programs and data files that are referenced. An edge exists from parent node A to child node B when programme A calls programme B or when programme A uses file B. A graphical user interface (GUI) helps the user work with this tracing tool to determine which files to hoard.
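A minimal sketch of such a tree, assuming the reference history is available as a list of (parent programme, child) pairs, might look as follows. The event format and the function names `build_execution_tree` and `files_to_hoard` are hypothetical and are not taken from the cited work.

```python
from collections import defaultdict


def build_execution_tree(events):
    """Build parent -> children edges from (parent_program, child) pairs.

    A child is either a called programme or a data file that was used.
    """
    children = defaultdict(list)
    for parent, child in events:
        if child not in children[parent]:   # keep one edge per pair
            children[parent].append(child)
    return dict(children)


def files_to_hoard(tree, root):
    """Return every programme and file reachable from root, in preorder."""
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Reverse so that the first recorded child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return order


trace = [
    ("make", "cc"),
    ("make", "Makefile"),
    ("cc", "main.c"),
    ("cc", "util.h"),
    ("make", "cc"),   # repeated execution adds no new edge
]
tree = build_execution_tree(trace)
needed = files_to_hoard(tree, "make")
```

Selecting the node for `make` then yields everything a later disconnected run of that programme is likely to touch.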

Apart from making the hoarding decision clearer to users, this strategy helps distinguish between the files read during different executions of the same programme.

Seer [Kue94] is a user-aware predictive caching system. Automatic prefetching of files is based on a metric called semantic distance, which measures how closely related two files are.

The measure chosen is the local reference distance from file A to file B. This distance can be informally defined as the number of file references separating two adjacent references to A and B in the history of past file references.
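The informal definition can be illustrated with a short Python sketch. It makes two assumptions that the text does not state: each reference to A is paired with the next reference to B that occurs before A is referenced again, and the distance counts only the references in between. The function name `local_reference_distances` is hypothetical and this is not Seer's implementation.

```python
def local_reference_distances(history, a, b):
    """List the distances from a to b over a history of file references.

    Each distance is the number of references lying between a reference
    to a and the next reference to b (assumed pairing rule).
    """
    distances = []
    last_a = None   # index of the most recent unpaired reference to a
    for i, name in enumerate(history):
        if name == a:
            last_a = i
        elif name == b and last_a is not None:
            distances.append(i - last_a - 1)
            last_a = None   # this reference to a is now paired
    return distances


history = ["A", "X", "Y", "B", "A", "B", "C", "B", "A"]
a_to_b = local_reference_distances(history, "A", "B")
b_to_a = local_reference_distances(history, "B", "A")
```

In this history the distances from A to B are 2 and 0, while those from B to A are both 0, which shows that the measure is directional: the distance from A to B need not equal the distance from B to A.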
