Module | Name | Version | License | Source | Languages | Platforms | Type | Author | Description |
---|---|---|---|---|---|---|---|---|---|
FSArch | Archiver on the file system | 3.8 | GPL2 | arh_FSArch.so | en,uk,ru,de | x86,x86_64,ARM | Archive | Roman Savochenko; Maxim Lysenko (2009), initial translation of the page | The archiver module. Provides functions for archiving messages and values to the file system. |
The module is designed for archiving messages and values of OpenSCADA on the file system.
Any SCADA system provides the ability to archive acquired data, that is, to form a history of the changes (dynamics) of processes. Archives can be conventionally divided into two types: message archives and value archives.
The distinguishing feature of message archives is that they store events. The characteristic attribute of an event is its time of occurrence. Message archives are usually used to archive program messages, that is, to keep logs and reports. Depending on the source, messages can be classified by different criteria, for example: reports of emergency situations, reports of operator actions, reports of connection failures, and others.
The distinguishing feature of value archives is their periodicity, which is determined by the time interval between two adjacent values. Value archives are used to store the history of continuous processes. Since the process is continuous, it can be archived only by quantizing the polling time; otherwise the archive would be infinitely large. In practice, the time resolution of the values is also limited by the data sources. For example, even fairly high-quality industrial data sources rarely provide data at a rate above 1 kHz, and that is without taking into account the sensors themselves, which have even more modest characteristics.
For archiving, OpenSCADA provides the "Archives-History" subsystem. In accordance with the archive types, this subsystem consists of two parts: the message archive and the value archives. The subsystem as a whole is modular, which allows archives to be created on the basis of different kinds and methods of data storage. This module provides a mechanism for archiving both the message flow and the values to the file system.
1 Message archiver
Message archives are formed by archivers. There can be many archivers, each with individual settings, which allows different classes of messages to be archived separately.
The message archiver of this module can store data in XML files or in a plain-text format. XML is a standard format that many external applications understand easily. However, opening and viewing files in this format requires considerable resources. The plain-text format requires far fewer resources, but it is not a standardized format, so you need to know its structure to process it externally.
Both formats are supported, and the user can select either of them according to their requirements.
Archive files are named by archivers according to the date of the first message in the archive, for example: "2018-05-03 17.57.03.msg".
Archive files can be limited in size and time. After a limit is exceeded, a new file is created. The maximum number of files in the archiver directory can also be restricted. After the limit on the number of files is exceeded, the oldest files are deleted!
To optimize the use of disk space, archivers support packaging of old archives with the gzip packer. Packaging is carried out after the archive has not been used for a long time.
When archives are stored as XML, the corresponding files are loaded into memory entirely! To unload archives that have not been used for a long time, an access timeout is applied: once it expires, the archive is unloaded from memory and then packaged.
The module provides additional settings for the archiving process, Figure 1.
The additional parameters are:
- Files of the archive in XML: enables archiving of messages to files in XML format rather than plain text. Archiving in XML format requires more RAM, because the file has to be loaded in full, parsed and kept in memory while it is in use.
- Prevent duplicates: enables checking for duplicate messages when messages are placed in the archive. A duplicate message is not placed in the archive. This function slightly increases the time of writing to the archive, but it excludes duplication when messages with past timestamps are placed in the archive from external sources.
- Consider duplicates and prevent, for equal time, category, level: enables checking for duplicate messages when messages are placed in the archive. Messages are considered duplicates if they are equal in time, category and level. The new duplicate message replaces the old one in the archive. This feature is mainly useful for changing the text of a message, for example, for a violation status.
- Maximum size of archive's file, in kilobytes: sets the limit on the size of one archive file. Set the parameter to zero to disable the restriction.
- Maximum number of the files: limits the maximum number of archive files and, together with the size of a single file, determines the size of the archive on disk. Set the parameter to zero to remove this restriction completely.
- Time size of the archive files, in days: sets the limit on the time span of one archive file.
- Timeout packaging archive files, in minutes: sets the time after which, in the absence of requests, the archive file is packaged with gzip. Set to zero to disable packaging by gzip.
- Period of the archives checking, in minutes: sets the period for checking the archive folder for added or deleted archive files, as well as for exceeded limits and removal of old archive files.
- Use info file for packaged archives: specifies that a file with information about the gzip-packaged archive files is to be created. When the archive files are copied to another station, this information file speeds up the first start of the destination station by eliminating the need to unpack the gzip archives to obtain that information.
- Check now for the directory of the archiver: a command that runs the archive check immediately, for example, after the archiver folder has been modified manually.
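The two duplicate-handling options described above can be illustrated with a minimal Python sketch. It works on an in-memory list, and the names `Msg` and `put` are hypothetical; it is not the module's actual implementation, only one way to express the described behaviour.

```python
from dataclasses import dataclass

@dataclass
class Msg:
    tm: int      # message time, UTC seconds
    cat: str     # category
    lev: int     # level
    text: str    # message text

def put(archive, msg, prevent_dups=False, replace_by_key=False):
    """Append msg to archive (a list), applying the selected duplicate policy."""
    if replace_by_key:
        # Duplicates are messages equal in time, category and level:
        # the new message replaces the old one.
        for i, old in enumerate(archive):
            if (old.tm, old.cat, old.lev) == (msg.tm, msg.cat, msg.lev):
                archive[i] = msg
                return
    elif prevent_dups and msg in archive:
        # A fully identical message is simply not stored.
        return
    archive.append(msg)
```

The linear scan shows why the source notes a slight increase in write time when duplicate checking is enabled.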
To control the archiver files you can view the tab "Files", Figure 2.
1.1 Files format of the message archive
The table below shows the syntax of the archive file built in XML-language:
Tag | Description | Attributes | Contains |
---|---|---|---|
FSArch | The root element. Identifies the file as belonging to the module. | Version: version of the archive file; | (m) |
m | Tag of a separate message. | tm: time of creation of the message (hex, UTC in seconds from 01/01/1970); | Text of the message |
An archive file in the plain-text format consists of:
- a header in the format "FSArch {vers} {charset} {beg_tm} {end_tm}", where:
  - vers: version of the archiving module;
  - charset: code page of the file, usually UTF8;
  - beg_tm: UTC start time of the archive from 01.01.1970, in hexadecimal form;
  - end_tm: UTC end time of the archive from 01.01.1970, in hexadecimal form.
- message records in the format "{tm} {lev} {cat} {mess}", where:
  - tm: message time in the format "{utc_sec}:{usec}", where:
    - utc_sec: UTC time from 01.01.1970, in hexadecimal form;
    - usec: microseconds of the time, in decimal form.
  - lev: level of importance of the message;
  - cat: category of the message;
  - mess: text of the message.
The message text and its category are encoded so that they do not contain the separator character (space).
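The plain-text layout described above can be read with a short Python sketch. It assumes that the encoding of the category and message text is percent-encoding, as the example in section 1.2 suggests (a space appears there as `%20`); the function names `parse_header` and `parse_record` are hypothetical.

```python
from urllib.parse import unquote

def parse_header(line):
    """Parse 'FSArch {vers} {charset} {beg_tm} {end_tm}'."""
    tag, vers, charset, beg_tm, end_tm = line.split()
    if tag != "FSArch":
        raise ValueError("not an FSArch file")
    return {"vers": vers, "charset": charset,
            "beg": int(beg_tm, 16), "end": int(end_tm, 16)}

def parse_record(line):
    """Parse '{utc_sec}:{usec} {lev} {cat} {mess}'."""
    tm, lev, cat, mess = line.split(" ", 3)
    utc_sec, usec = tm.split(":")
    return {"sec": int(utc_sec, 16),   # hexadecimal UTC time
            "usec": int(usec),         # decimal microseconds
            "lev": int(lev),
            "cat": unquote(cat),       # assumed percent-encoding
            "mess": unquote(mess)}
```

Because spaces inside the fields are encoded, splitting each record on the first three spaces is enough to separate the four fields.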
1.2 Example of the messages archive file
Example of the content of an archive file in the format of the XML language:
<?xml version='1.0' encoding='UTF-8' ?>
<FSArch Version="3.2.0" Begin="5aeb630a" End="5aeb630a">
<m tm="5aeb630a" tmu="175076" lv="4" cat="/sub_Transport/mod_Sockets/in_Self/">Bind to TCP socket error: 'Address already in use (98)'!</m>
<m tm="5aeb630a" tmu="175105" lv="4" cat="/sub_Transport/">AGLKS > Transports: Error starting the input transport 'Self'.</m>
</FSArch>
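A file of this form can be read with a standard XML parser. The following Python sketch uses the attribute names seen in the example above (`tm`, `tmu`, `lv`, `cat`); the interpretation of `tmu` as decimal microseconds and of `lv` as the level is an assumption based on that example, and the function name `read_xml_archive` is hypothetical.

```python
import xml.etree.ElementTree as ET

def read_xml_archive(data):
    """Return (version, messages) from the bytes of an XML archive file."""
    root = ET.fromstring(data)
    if root.tag != "FSArch":
        raise ValueError("not an FSArch file")
    msgs = []
    for m in root.iter("m"):
        msgs.append({
            "sec": int(m.get("tm"), 16),     # hex UTC seconds
            "usec": int(m.get("tmu", "0")),  # assumed: decimal microseconds
            "lev": int(m.get("lv", "0")),    # assumed: message level
            "cat": m.get("cat", ""),
            "text": m.text or "",
        })
    return root.get("Version"), msgs
```

Note that the whole document is parsed at once, which matches the remark that XML archives are loaded into memory entirely.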
Example of the content of an archive file in the plain-text format:
FSArch 3.2.0 UTF-8 5aec2095 5aec2097
5aec2095:951814 1 / AGLKS:%20Starting.
5aec2095:952258 4 /sub_Transport/mod_Sockets/in_Self/ Bind%20to%20TCP%20socket%20error:%20'Address%20already%20in%20use%20(98)'!
5aec2095:952286 4 /sub_Transport/ AGLKS%20>%20Transports:%20Error%20starting%20the%20input%20transport%20'Self'.
5aec2095:958411 4 /sub_Transport/mod_SSL/in_Self/ BIO_do_accept:%20error:20069075:BIO%20routines:BIO_get_accept_socket:unable%20to%20bind%20socket
5aec2095:959150 4 /sub_Transport/mod_SSL/in_WEB_1/ BIO_do_accept:%20error:20069075:BIO%20routines:BIO_get_accept_socket:unable%20to%20bind%20socket
5aec2095:960309 4 /sub_Transport/mod_SSL/in_WEB_2/ BIO_do_accept:%20error:20069075:BIO%20routines:BIO_get_accept_socket:unable%20to%20bind%20socket
5aec2096:117761 1 /sub_DAQ/mod_JavaLikeCalc/cntr_prescr/ AGLKS%20>%20Data%20Acquisition%20>%20JavaLikeCalc%20>%20prescr:%20Controller%20enabling.
5aec2096:118216 1 /sub_DAQ/mod_JavaLikeCalc/cntr_testCalc/ AGLKS%20>%20Data%20Acquisition%20>%20JavaLikeCalc%20>%20testCalc:%20Controller%20enabling.
5aec2096:118439 1 /sub_DAQ/mod_Siemens/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20Siemens%20>%20test:%20Controller%20enabling.
5aec2096:120654 1 /sub_DAQ/mod_System/cntr_AutoDA/ AGLKS%20>%20Data%20Acquisition%20>%20System%20>%20AutoDA:%20Controller%20enabling.
5aec2096:124153 1 /sub_DAQ/mod_DAQGate/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20DAQGate%20>%20test:%20Controller%20enabling.
5aec2096:124885 1 alDAQGate:test DAQ.DAQGate.test:%20connecting%20to%20the%20data%20source:%20OK.
5aec2096:127262 1 /sub_DAQ/mod_ModBus/cntr_testRTU/ AGLKS%20>%20Data%20Acquisition%20>%20ModBus%20>%20testRTU:%20Controller%20enabling.
5aec2096:127435 1 /sub_DAQ/mod_ModBus/cntr_testTCP/ AGLKS%20>%20Data%20Acquisition%20>%20ModBus%20>%20testTCP:%20Controller%20enabling.
5aec2096:127599 1 /sub_DAQ/mod_AMRDevs/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20AMRDevs%20>%20test:%20Controller%20enabling.
5aec2096:127627 1 /sub_DAQ/mod_SoundCard/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20SoundCard%20>%20test:%20Controller%20enabling.
5aec2096:127677 1 /sub_DAQ/mod_OPC_UA/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20OPC_UA%20>%20test:%20Controller%20enabling.
5aec2096:130491 1 /sub_DAQ/mod_DCON/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20DCON%20>%20test:%20Controller%20enabling.
5aec2096:130584 1 /sub_DAQ/mod_BlockCalc/cntr_Anast1to2node/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20Anast1to2node:%20Controller%20enabling.
5aec2096:138999 1 /sub_DAQ/mod_BlockCalc/cntr_Anast1to2node_cntr/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20Anast1to2node_cntr:%20Controller%20enabling.
5aec2096:143228 1 /sub_DAQ/mod_BlockCalc/cntr_KM101/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM101:%20Controller%20enabling.
5aec2096:149276 1 /sub_DAQ/mod_BlockCalc/cntr_KM102/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM102:%20Controller%20enabling.
5aec2096:155772 1 /sub_DAQ/mod_BlockCalc/cntr_KM102cntr/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM102cntr:%20Controller%20enabling.
5aec2096:156789 1 /sub_DAQ/mod_BlockCalc/cntr_KM201/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM201:%20Controller%20enabling.
5aec2096:163800 1 /sub_DAQ/mod_BlockCalc/cntr_KM301/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM301:%20Controller%20enabling.
5aec2096:169499 1 /sub_DAQ/mod_BlockCalc/cntr_KM302/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM302:%20Controller%20enabling.
5aec2096:175966 1 /sub_DAQ/mod_BlockCalc/cntr_КМ202/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20КМ202:%20Controller%20enabling.
5aec2096:182447 1 /sub_DAQ/mod_LogicLev/cntr_experiment/ AGLKS%20>%20Data%20Acquisition%20>%20LogicLev%20>%20experiment:%20Controller%20enabling.
5aec2096:194077 1 /sub_DAQ/mod_LogicLev/cntr_prescription/ AGLKS%20>%20Data%20Acquisition%20>%20LogicLev%20>%20prescription:%20Controller%20enabling.
5aec2097:365476 1 /sub_DAQ/mod_JavaLikeCalc/cntr_prescr/ AGLKS%20>%20Data%20Acquisition%20>%20JavaLikeCalc%20>%20prescr:%20Controller%20starting.
5aec2097:366684 1 /sub_DAQ/mod_System/cntr_AutoDA/ AGLKS%20>%20Data%20Acquisition%20>%20System%20>%20AutoDA:%20Controller%20starting.
5aec2097:367894 1 /sub_DAQ/mod_DAQGate/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20DAQGate%20>%20test:%20Controller%20starting.
5aec2097:369061 1 /sub_DAQ/mod_ModBus/cntr_testTCP/ AGLKS%20>%20Data%20Acquisition%20>%20ModBus%20>%20testTCP:%20Controller%20starting.
5aec2097:369452 1 alModBus:testTCP DAQ.ModBus.testTCP:%20connection%20to%20data%20source:%20OK.
5aec2097:370335 1 /sub_DAQ/mod_OPC_UA/cntr_test/ AGLKS%20>%20Data%20Acquisition%20>%20OPC_UA%20>%20test:%20Controller%20starting.
5aec2097:371492 1 /sub_DAQ/mod_BlockCalc/cntr_Anast1to2node/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20Anast1to2node:%20Controller%20starting.
5aec2097:372185 1 alOPC_UA:test DAQ.OPC_UA.test:%20connect%20to%20data%20source:%20OK.
5aec2097:373148 1 /sub_DAQ/mod_BlockCalc/cntr_Anast1to2node_cntr/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20Anast1to2node_cntr:%20Controller%20starting.
5aec2097:374380 1 /sub_DAQ/mod_BlockCalc/cntr_KM101/ AGLKS%20>%20Data%20Acquisition%20>%20BlockCalc%20>%20KM101:%20Controller%20starting.
2 Value archiver
By default, value archives are formed by the value archivers individually for each registered archive. Archivers can have many individual settings, which allows archives to be separated by different parameters, for example, by accuracy and depth.
A value archive is an independent component that includes a buffer processed by the archivers. The main parameter of a value archive is its data source. The data source can be an attribute of the "Data acquisition" subsystem or another, external data source (passive mode). Other data sources can be: network archivers of remote OpenSCADA stations, the OpenSCADA programming environment, and so on. No less important are the parameters of the archive buffer, since the ability of the archivers to work depends on them. The periodicity of values in the buffer must be no greater than the periodicity of the fastest archiver, and the buffer size must be no less than double the size for the slowest archiver. Otherwise data loss is possible.
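The two buffer conditions can be written down as a simple check. The Python sketch below is only an illustration: it assumes that the "buffer size" is expressed as the time span the buffer holds and that "double the size for the slowest archiver" means twice that archiver's period. Both readings are assumptions, and the function name is hypothetical.

```python
def buffer_is_adequate(buf_period, buf_depth, archiver_periods):
    """Check the buffer against the archivers that read from it.

    buf_period       -- period of values in the buffer, seconds
    buf_depth        -- time span held by the buffer, seconds (assumed meaning of "size")
    archiver_periods -- periods of all archivers serving this archive, seconds
    """
    fastest = min(archiver_periods)
    slowest = max(archiver_periods)
    # The buffer must be at least as fine-grained as the fastest archiver
    # and hold at least two periods of the slowest one.
    return buf_period <= fastest and buf_depth >= 2 * slowest
```

For example, with archivers at 1 s and 60 s, a buffer with a 1 s period holding 120 s of data passes the check, while a 5 s period or a 100 s depth does not.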
The general scheme of value archiving is shown in Figure 3.
Archive files are named by archivers according to the archive ID and the date of the first value in the archive, for example: "CPULoad_load 2018-04-03 19.13.52.val".
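A file name of this form can be split back into the archive ID and the time of the first value. The Python sketch below assumes exactly the pattern of the example, "{ID} {YYYY-MM-DD} {HH.MM.SS}.val"; the function name is hypothetical, and the source does not say whether the time is local or UTC, so the result is returned as a naive datetime.

```python
from datetime import datetime

def parse_val_name(name):
    """Split '<archive ID> YYYY-MM-DD HH.MM.SS.val' into (ID, datetime)."""
    if not name.endswith(".val"):
        raise ValueError("unexpected extension")
    # The ID may itself contain spaces, so split from the right.
    arch_id, date, time = name[:-4].rsplit(" ", 2)
    return arch_id, datetime.strptime(date + " " + time, "%Y-%m-%d %H.%M.%S")
```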
Archive files can be limited in time. After the limit is exceeded, a new file is created. The maximum number of files in the archiver directory can also be restricted. After the limit on the number of files is exceeded, the oldest files are deleted!
To save disk space, archivers support, in addition to the sequential packaging, packaging of old archives with the gzip packer. Packaging is carried out after the archive has not been used for a long time. To allow large archives to be connected quickly to another station, an information file for the packaged files can be enabled, which removes the need to unpack all the files first at the other station.
The module provides additional settings for the archiving process, Figure 4.
The additional parameters are:
- Time size of the archive files, in hours: the parameter is set automatically when the value periodicity of the archiver is changed and is generally proportional to that periodicity. Note that large archive files take longer to process when archives are accessed deep in the history, because of the long unpacking of gzip files and the initial indexing.
- Maximum number of files per archive: limits the maximum number of archive files and, together with the size of a single file, determines the size of the archive on disk. Set the parameter to zero to remove this restriction completely.
- Maximum size of all archives, in megabytes: sets the limit on the disk space occupied by the files of all archives of the archiver. The check is carried out with the period of the archives checking (see below); if the limit is exceeded, the oldest files of all archives are removed. Set the parameter to zero to remove this restriction completely.
- Rounding for numeric values (%): sets the maximum percentage difference between values of integer and real parameters at which they are considered identical and are stored in the archive as one value by means of sequential packaging. This allows slightly changing parameters, whose changes lie beyond the meaningful accuracy, to be packed well. Set to zero to disable this property.
- Timeout packaging archive files, in minutes: sets the time after which, in the absence of requests, the archive file is packaged with gzip. Set to zero to disable packaging by gzip.
- Period of the archives checking, in minutes: sets the period for checking the archive folder for added or deleted archive files, as well as for exceeded limits and removal of old archive files.
- Use info file for packaged archives: specifies that a file with information about the gzip-packaged archive files is to be created. When the archive files are copied to another station, this information file speeds up the first start of the destination station by eliminating the need to unpack the gzip archives to obtain that information.
- Check now for the directory of the archiver: a command that runs the archive check immediately, for example, after the archiver folder has been modified manually.
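The total-size limit described in the list above removes the oldest files until the occupied space fits. A minimal Python sketch of that selection follows; the tuple layout, the function name and the choice of 1 MB = 1024 × 1024 bytes are assumptions made for the illustration, not details taken from the module.

```python
def files_to_remove(files, max_total_mb):
    """Pick files to delete, oldest first, until the total size fits the limit.

    files        -- list of (begin_time, size_bytes, path) for all archives
    max_total_mb -- limit in megabytes; zero disables the restriction
    """
    if max_total_mb == 0:
        return []
    limit = max_total_mb * 1024 * 1024   # assumed megabyte definition
    total = sum(size for _, size, _ in files)
    doomed = []
    for begin, size, path in sorted(files):   # oldest begin time first
        if total <= limit:
            break
        doomed.append(path)
        total -= size
    return doomed
```

In the module such a check would run with the period of the archives checking; here it is a pure function so that the selection rule can be examined in isolation.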
To control the archiver files you can view the tab "Files", Figure 5.
2.1 Files format of the value archive
Archiving to the file system has to meet the following requirements:
- quick and easy access for appending to the archive and reading from it;
- the possibility of changing values in an existing archive, to fill holes in redundant systems;
- cyclic operation and size limits;
- the possibility of compression by packing sequences of identical values in a way that preserves quick access (sequential packaging);
- the possibility of packing obsolete data with standard archivers (gzip, bzip2 ...), with extraction on access.
In accordance with these requirements, archiving is organized as multiple files (for each source). The archive cycle is implemented at the file level: a new file is created and the oldest one is deleted. For quick compression, a run of identical values is collapsed into a single stored value. For this purpose, the archive file contains a packaging bitmap with one bit per value in the archive. The bit indicates whether a value is stored: for a run of identical values, the bits are reset. In an archive of strings, the table is bytewise rather than bitwise and contains the length of the corresponding value. For a run of identical values, the length is zero for all values except the first, which holds the value itself. Since each table entry is one byte, the archive can store strings up to 255 characters long. Thus, the storage methods are divided into those for fixed-size and for non-fixed-size data. The general structure of the archive file is shown in Figure 6.
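The idea of sequential packaging for fixed-size values can be sketched in a few lines of Python. This is a simplified illustration, not the module's on-disk code: the bitmap is modelled as a list of 0/1 integers, a set bit means "a value is stored here" and a cleared bit means "same as the previous value". The function names are hypothetical.

```python
def pack(values):
    """Return (bits, stored): bits[i] is 1 if values[i] is stored, 0 if it repeats."""
    bits, stored = [], []
    for i, v in enumerate(values):
        if i > 0 and v == values[i - 1]:
            bits.append(0)          # repeat of the previous value: nothing stored
        else:
            bits.append(1)          # new value: store it
            stored.append(v)
    return bits, stored

def unpack(bits, stored):
    """Rebuild the full sequence from the bitmap and the stored values."""
    out, pos = [], -1
    for b in bits:
        pos += b                    # move to the next stored value when the bit is set
        out.append(stored[pos])
    return out
```

For the sequence 5, 5, 5, 7, 7, 5 only three values are stored, and the bitmap is enough to restore all six.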
When a new archive file is created, the following are written: the header (its structure is given in Table 1), a zeroed packaging bitmap and the first not-valid value (EVAL). In this way, an archive filled with not-valid values is initialized. From then on, new values are inserted into the value area and the packaging table is adjusted accordingly. It follows that passive archives degenerate into files the size of the header plus the bitmap table.
Table 1. The structure of the header of archive file
Field | Description | Size in bytes (bits) |
---|---|---|
f_tp | System name of the archive ("OpenSCADA Val Arch.") | 20 |
archive | Name of the archive to which the file belongs. | 20 |
beg | Start time of the archive data, in microseconds | 8 |
end | End time of the archive data, in microseconds | 8 |
period | Period of the archive, in microseconds | 8 |
vtp | Type of value in the archive: Boolean, Integer (Int16, Int32, Int64), Real (Float, Double), String | (3) |
hgrid | Sign of using of hard grid in the buffer of the archive | (1) |
hres | Sign of using the high resolution time (microseconds) in the archive buffer | (1) |
reserve | Reserve | 14 |
term | Symbol of the end of the file header (0x55) | 1 |
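The header in Table 1 can be read with Python's `struct` module. The sketch below rests on several assumptions that the table does not state: little-endian byte order, no padding between fields, the three bit-fields sharing a single byte, and `vtp` occupying its lowest three bits followed by `hgrid` and `hres`. The function name is hypothetical, and the actual layout depends on how the module was compiled.

```python
import struct

# Assumed layout: f_tp[20], archive[20], beg, end, period (signed 64-bit each),
# one byte of bit-fields, reserve[14], term. Total: 80 bytes.
HEAD = struct.Struct("<20s20sqqqB14sB")

def parse_head(raw):
    """Decode the archive file header from the first bytes of the file."""
    f_tp, archive, beg, end, period, flags, _reserve, term = HEAD.unpack(raw[:HEAD.size])
    if term != 0x55:
        raise ValueError("bad header terminator")
    return {
        "f_tp": f_tp.rstrip(b"\0").decode(),
        "archive": archive.rstrip(b"\0").decode(),
        "beg": beg, "end": end, "period": period,   # microseconds
        "vtp": flags & 0x07,          # assumed: lowest three bits
        "hgrid": (flags >> 3) & 1,    # assumed bit position
        "hres": (flags >> 4) & 1,     # assumed bit position
    }
```

Checking the 0x55 terminator gives a cheap sanity test that the assumed layout lines up with the file.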
The sequential packaging mechanism is illustrated in Figure 7. As the figure shows, the packaging table holds, for each individual value, either its length (non-fixed types) or a packaging flag (fixed types). This means that to obtain the offset of a given value, the lengths of all preceding valid values must be summed. Doing this every time and for every value is very expensive, so a mechanism for caching value offsets was introduced. It caches the offsets at regular intervals of values and also caches the offset of the last accessed value (separately for reading and writing).
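The offset caching can be illustrated with a small Python sketch. It shows only the checkpoint part (offsets cached every `step` values), not the separate cache of the last accessed value; the class name, the `step` parameter and the in-memory length table are hypothetical simplifications.

```python
class OffsetIndex:
    """Find value offsets from a per-value length table, using cached checkpoints."""

    def __init__(self, lengths, step=4):
        self.lengths = lengths
        self.step = step
        self.checkpoints = []        # checkpoints[k] = offset of value number k*step
        off = 0
        for i, n in enumerate(lengths):
            if i % step == 0:
                self.checkpoints.append(off)
            off += n

    def offset(self, i):
        """Sum of the lengths of all values before i, starting from the nearest checkpoint."""
        k = i // self.step
        off = self.checkpoints[k]
        for n in self.lengths[k * self.step:i]:
            off += n
        return off
```

With checkpoints, a lookup sums at most `step - 1` lengths instead of all preceding ones, at the cost of one stored offset per `step` values.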
Changing values inside an existing archive is also supported. However, because the tail of the archive has to be shifted, it is recommended to perform this operation as rarely as possible and in blocks as large as possible.
3 Efficiency
During the design and implementation of this module, mechanisms for increasing the efficiency of the archiving process were built in.
The first mechanism is block (single-frame or transactional) placement of data in the value archive file. This mechanism achieves the maximum archiving speed and, accordingly, allows more data streams to be archived simultaneously. Practical experience has shown that a K8-3000 system with a regular IDE hard drive can archive up to 300000 data streams with a period of 1 second, and a K5-400 system with an IDE disk (2.5") can archive up to 100 parameters with a period of 1 millisecond.
The second mechanism is packaging of both current values and outdated archive files to optimize disk space usage. Two packaging mechanisms are implemented: sequential packaging (value archives) and compression of archives by the standard packer (gzip). This approach achieves high performance when archiving current data with the efficient sequential compression, while the final packaging of outdated archives by the standard packer completes the picture of compact storage of large data arrays. Statistics from practical application on a real noisy signal (the worst case) showed a sequential packaging degree of 10% and a final packaging degree of 71%.