
threatq:sync-import

This import command processes the tarball of object data created by the threatq:sync-export command. Temporary sync tables are created on the target to house this object data, and integrity checks are run against existing data to verify IDs and check for duplicate objects. Duplicate objects from the source ThreatQ installation are updated, and new objects are inserted. The temporary sync tables are dropped when data processing is complete. Each run of this command also generates a sync report with output logs for the run.

Upon upgrade to ThreatQ 6x, the /var/lib/threatq/agds_transfer directory is created and becomes the default location for exporting and importing Air Gapped Data Sync (AGDS) files. As such, AGDS commands only need to specify the relative path to the folders you created within this directory for AGDS exports or imports. Use the --target parameter to specify the location when exporting the AGDS file, and the --file parameter to specify the location from which to import the .tar.gz file.

The import process moves data from the source instance to the target instance based on changes to the object's updated_at date.  It does not explicitly remove data from the target instance. 

Parameters

The following table outlines the parameters for the command. With the exception of --file, all parameters for the threatq:sync-import command will use the default value unless otherwise defined by the user.

Parameter Explanation
--file Required value. File path to the tarball created by the threatq:sync-export command.

Example: --file=/tmp/tqSync-19-01-16-1547660837-8345.tar.gz
Example: --file=imports/tqSync-19-01-16-1547660837-8345.tar.gz
--keep-created-at Determines whether the oldest created_at date between the source and target ThreatQ installations is maintained, or a new created_at date is set on the target system. If this option is omitted, the oldest created_at date is maintained. Options are Y(es) or N(o).

Default: Y

Example: --keep-created-at=N
--object-limit Integer value used as the limit for the number of objects updated or inserted at a time.  When using this option, the size of the data sets on both source and target ThreatQ installations should be taken into account. Setting the limit too high may hinder performance.

Default: 1000

Example: --object-limit=50000
--memory-limit Sets the PHP memory limit in megabytes or gigabytes.

Default: 2G

Example: --memory-limit=4G
--override-description Determines whether the descriptions on existing objects on the target ThreatQ installation are updated. If an existing object has a NULL description, it is updated regardless of this flag.

Default: Y

Example: --override-description=N
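
As a rough illustration of how the parameters in the table combine, the following Python sketch assembles an argument list for the command. The helper name build_sync_import_args is hypothetical and not part of ThreatQ; only the flag names and default values are taken from the table above.

```python
from typing import List


def build_sync_import_args(
    file: str,
    keep_created_at: str = "Y",
    object_limit: int = 1000,
    memory_limit: str = "2G",
    override_description: str = "Y",
) -> List[str]:
    """Hypothetical helper: build a threatq:sync-import argument list."""
    if not file:
        raise ValueError("--file is required")
    for name, value in (
        ("keep_created_at", keep_created_at),
        ("override_description", override_description),
    ):
        if value not in ("Y", "N"):
            raise ValueError(f"{name} must be 'Y' or 'N'")
    if object_limit <= 0:
        raise ValueError("object_limit must be a positive integer")
    # Every optional flag is spelled out with its documented default.
    return [
        "sudo", "./artisan", "threatq:sync-import",
        f"--file={file}",
        f"--keep-created-at={keep_created_at}",
        f"--object-limit={object_limit}",
        f"--memory-limit={memory_limit}",
        f"--override-description={override_description}",
    ]


if __name__ == "__main__":
    args = build_sync_import_args(
        "/tmp/tqSync-19-01-16-1547660837-8345.tar.gz", object_limit=50000
    )
    print(" ".join(args))
```

Passing the list to subprocess.run from the directory that contains artisan would execute the import; spelling out the defaults simply makes the effective configuration visible in logs.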

Examples

Basic Run

sudo ./artisan threatq:sync-import --file=/tmp/tqSync-19-01-16-1547660837-8345.tar.gz

sudo ./artisan threatq:sync-import --file=relative/path/tqSync-19-01-16-1547660837-8345.tar.gz

This example processes all the data in the tarball provided in the --file option, using an object limit of 1000 for all inserts and updates. For an object that is already present on the target ThreatQ installation, the created_at date is updated only if the date from the source is older than the current created_at date on the target. Newly inserted objects keep the created_at date of the source ThreatQ installation.

Set New created_at Dates on the Write System

sudo ./artisan threatq:sync-import --file=/tmp/tqSync-19-01-16-1547660837-8345.tar.gz --keep-created-at=N

sudo ./artisan threatq:sync-import --file=relative/path/tqSync-19-01-16-1547660837-8345.tar.gz --keep-created-at=N

This example processes all the data in the tarball provided in the --file option using an object limit of 1000 for all inserts and updates. The created_at date of all transferred objects is left alone in the case of object updates, and changed to the current date in the case of new object inserts.
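
The created_at behavior described for --keep-created-at=Y and --keep-created-at=N can be summarized in a small Python sketch. The function resolve_created_at is a hypothetical name used only for illustration; it restates the rules from the two examples above and is not the product's implementation.

```python
from datetime import datetime
from typing import Optional


def resolve_created_at(
    source_created_at: datetime,
    target_created_at: Optional[datetime],
    keep_created_at: bool,
    now: datetime,
) -> datetime:
    """Return the created_at value the target record ends up with.

    target_created_at is None when the object does not yet exist
    on the target installation (a new insert).
    """
    if target_created_at is None:
        # New insert: Y keeps the source date, N uses the current date.
        return source_created_at if keep_created_at else now
    if keep_created_at:
        # Update with Y: the oldest of the two dates is maintained.
        return min(source_created_at, target_created_at)
    # Update with N: the target's existing date is left alone.
    return target_created_at


if __name__ == "__main__":
    source = datetime(2018, 6, 1)
    target = datetime(2019, 1, 16)
    print(resolve_created_at(source, target, True, datetime.now()))
```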

Increase the Object Limit

sudo ./artisan threatq:sync-import --file=/tmp/tqSync-19-01-16-1547660837-8345.tar.gz --object-limit=50000

sudo ./artisan threatq:sync-import --file=relative/path/tqSync-19-01-16-1547660837-8345.tar.gz --object-limit=50000

This example processes all the data in the tarball provided in the --file option using an object limit of 50000 for all inserts and updates. The --keep-created-at option has been left out, so it uses the default setting of Y(es) and created_at dates are retained from the source system.

Initial Setup

You must run the threatq:fill-sync-hash-column command before running the threatq:sync-import command on an air gapped ThreatQ installation. This command prepares the database of an air gapped installation to run the threatq:sync-import command. Upon upgrade to ThreatQ version 4.17 or later, several tables include a sync_hash column, which stores an MD5 hash of the unique fields for records in each table. The threatq:fill-sync-hash-column command fills in this column for existing records so that an Air Gapped Data Sync import can be attempted. Data added after the upgrade has its sync_hash column populated automatically on insert and update, so it is only necessary to run this command once.

The threatq:sync-import command checks for any NULL values in the sync_hash column in the events, indicators, and object_links tables before importing any data. If the threatq:fill-sync-hash-column command has not been run and NULL sync_hash values are found in any of these tables, the import fails and asks you to run the command to fill that column before continuing.

  1. SSH to your target ThreatQ installation.
  2. Change directories to /var/www/api.
  3. Put the ThreatQ platform into maintenance mode:
    php artisan down

  4. Run the following command:
    sudo ./artisan threatq:fill-sync-hash-column

  5. Run php artisan up to bring ThreatQ out of maintenance mode.
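
If your ThreatQ installation runs in Kubernetes and is managed with kubectl, use the following steps instead:

  1. SSH to your target ThreatQ installation.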
  2. Put the ThreatQ platform into maintenance mode:
    kubectl exec --namespace threatq --stdin --tty deployment/api-schedule-run -- ./artisan down

  3. Run the following command:
    kubectl exec --namespace threatq --stdin --tty deployment/api-schedule-run -- ./artisan threatq:fill-sync-hash-column

  4. Bring ThreatQ out of maintenance mode:
    kubectl exec --namespace threatq --stdin --tty deployment/api-schedule-run -- ./artisan up
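
The pre-import check for NULL sync_hash values described in this section can be pictured with the following Python sketch. It uses an in-memory SQLite database purely as a stand-in, since the actual database engine and the full table schemas are not described here; the function name tables_with_null_sync_hash and the two-column tables are assumptions made for the example.

```python
import sqlite3

# The three tables the import inspects before processing any data.
TABLES = ("events", "indicators", "object_links")


def tables_with_null_sync_hash(conn: sqlite3.Connection) -> list:
    """Return the names of tables that still contain NULL sync_hash values."""
    offenders = []
    for table in TABLES:
        # Table names come from the fixed tuple above, never from user input.
        (count,) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE sync_hash IS NULL"
        ).fetchone()
        if count:
            offenders.append(table)
    return offenders


# Stand-in schema: only the columns needed to demonstrate the check.
conn = sqlite3.connect(":memory:")
for table in TABLES:
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, sync_hash TEXT)")
conn.execute(
    "INSERT INTO events (sync_hash) VALUES ('d41d8cd98f00b204e9800998ecf8427e')"
)
conn.execute("INSERT INTO indicators (sync_hash) VALUES (NULL)")

print(tables_with_null_sync_hash(conn))  # ['indicators']
```

An empty list would correspond to a database on which threatq:fill-sync-hash-column has already done its work.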

Run Scenarios

Import Success

When a run of this command completes successfully, a report appears in the directory the command was run in. There is also a record in the database synchronizations table for the run. Both of these contain data describing performance metrics and object counts.

Excluded Files

If the --ignore-file-types option was used during creation of the export tarball, the physical files associated with File objects that have the File Types specified in that option are not available during the import of those objects. If the import command detects that a file is missing from the export tarball, it creates a placeholder file under the same file path as was set on the read box (this is defined in the path field of the File). This placeholder file is a simple text file containing the phrase "File excluded from export." Because the original physical file associated with the File object has been replaced, it is no longer possible to open the physical file on the Details page for that File object.

Import Errors

If a run of this command fails before completion, error messages do not appear in the report file - though they do appear in the laravel log and in the console. There is not currently a means of restarting the command from where it left off. The command must be restarted and will run through all the data again. Any data from the tarball that was written during the previous failed run is updated (rather than inserted again), meaning the end result is the same - all data is transferred from the tarball to the target system.

Data Processing

Data found in CSV dump files for a table from the tarball provided in the --file option is inserted into a corresponding sync table. A sync table is a copy of a base table, with column structure maintained but indexes excluded. Because indexes slow the insertion process down, indexes are added to unique columns on sync tables (which are later used in table joins and where clauses) only once data insertion from dump files is complete.

The naming convention for a sync table is sync_import_<base table name>_<process id>.

Base table: adversaries

Sync table: sync_import_adversaries_12345

All sync tables are removed from the target ThreatQ installation's database once data processing is complete.
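
The naming convention above is simple enough to express as a one-line Python helper. The function name sync_table_name is hypothetical; the format string follows the convention and example given in this section.

```python
def sync_table_name(base_table: str, process_id: int) -> str:
    """Build a sync table name: sync_import_<base table name>_<process id>."""
    return f"sync_import_{base_table}_{process_id}"


print(sync_table_name("adversaries", 12345))  # sync_import_adversaries_12345
```

Because the process ID is part of the name, sync tables left behind by different runs would not share a name.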

Basic Table

A basic table has no foreign keys pointing to other tables in the database. It has a single identifier (id) column for each record. Once all the data stored in the tarball for a basic table has been transferred to a sync table, the sync table has an existing_id column added with a default value of NULL for each record. This column is used to determine whether the record already exists on the target ThreatQ installation. The ID for the record on the target system may be different from that of the record from the source ThreatQ installation, so this existing_id column ensures that data integrity is maintained between the two.

Sample Basic Table:

attachment_types - (id, name, is_parsable, parser_class, created_at, updated_at, deleted_at)

Sample Sync Table created from Basic Table:

sync_import_attachment_types_12345 - (existing_id, id, name, is_parsable, parser_class, created_at, updated_at, deleted_at)
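
To make the role of existing_id more concrete, here is a minimal Python sketch using an in-memory SQLite database. It is an illustration under stated assumptions, not the product's actual SQL: the table is trimmed to three columns, and matching on the name column is an assumption chosen for the example (the source does not state which columns are compared for this table).

```python
import sqlite3

SYNC = "sync_import_attachment_types_12345"  # sync_import_<base table>_<pid>

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    -- Base table on the target, trimmed to three columns for brevity.
    CREATE TABLE attachment_types (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT);
    INSERT INTO attachment_types (id, name, updated_at) VALUES (7, 'PDF', '2019-01-01');

    -- Sync table: base columns plus existing_id (default NULL), no indexes yet.
    CREATE TABLE {SYNC} (existing_id INTEGER DEFAULT NULL, id INTEGER, name TEXT, updated_at TEXT);
    INSERT INTO {SYNC} (id, name, updated_at)
    VALUES (3, 'PDF', '2019-01-16'), (4, 'STIX', '2019-01-16');

    -- Index the unique column only after the bulk insert has finished.
    CREATE INDEX idx_sync_name ON {SYNC} (name);

    -- Assumption: a row already exists when its name matches a target row.
    UPDATE {SYNC}
    SET existing_id = (SELECT b.id FROM attachment_types AS b WHERE b.name = {SYNC}.name);

    -- Duplicates: update the target row found through existing_id.
    UPDATE attachment_types
    SET updated_at = (SELECT s.updated_at FROM {SYNC} AS s WHERE s.existing_id = attachment_types.id)
    WHERE id IN (SELECT existing_id FROM {SYNC} WHERE existing_id IS NOT NULL);

    -- New objects: insert and let the target assign its own id.
    INSERT INTO attachment_types (name, updated_at)
    SELECT name, updated_at FROM {SYNC} WHERE existing_id IS NULL;

    -- Sync tables are dropped once processing is complete.
    DROP TABLE {SYNC};
""")

rows = conn.execute(
    "SELECT id, name, updated_at FROM attachment_types ORDER BY id"
).fetchall()
print(rows)
```

In this sketch the source row with id 3 maps to the target row with id 7, which is why the source id cannot be reused directly and a separate existing_id column is needed.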

Tables with Pivots

A pivot table has one or more foreign keys pointing to other tables in the database. Once all the data stored in the tarball for a table with pivots has been transferred to a sync table, the sync table has an existing_<pivot>_id column added for each foreign key column, as well as an existing_id column for the record itself (all set to a default value of NULL).

File Output

threatq:sync-import File Output and Sync Report

Once all data has been processed, a Sync Report is generated. This file is named after the tarball used in the run, with the .tar.gz extension replaced by the suffix "-sync-import.txt".

Tarball used: tqSync-19-01-16-1547660837-8345.tar.gz

Sync Report name: tqSync-19-01-16-1547660837-8345-sync-import.txt
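
The report name can be derived from the tarball name as in the following Python sketch. The helper sync_report_name is a hypothetical name; the rule it applies is inferred from the example pair above.

```python
import os


def sync_report_name(tarball_path: str) -> str:
    """Derive the Sync Report file name from the tarball used in the run."""
    name = os.path.basename(tarball_path)
    extension = ".tar.gz"
    if not name.endswith(extension):
        raise ValueError(f"expected a {extension} file, got {name!r}")
    # Drop the .tar.gz extension and append the report suffix.
    return name[: -len(extension)] + "-sync-import.txt"


print(sync_report_name("/tmp/tqSync-19-01-16-1547660837-8345.tar.gz"))
# tqSync-19-01-16-1547660837-8345-sync-import.txt
```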

threatq:sync-import Command Line Output

Command line output displays command progress and object totals. It is similar to the output in the Sync Report.

Synchronizations

Column        Description
id            The auto-incremented ID for the Synchronization record.
type          The Synchronization direction. Options are export or import.
started_at    The date and time the command run was started.
finished_at   The date and time the command run completed.
config_json   A JSON representation of the command run configuration.
report_json   A JSON representation of the command run parameters (command line options, object counts, tables created, etc).
pid           The process ID of the command run.
hash          Unique identifier for a command run (MD5 hash of the config_json column).
created_at    The date and time the Synchronization record was created.
updated_at    The date and time the Synchronization record was updated.

Record Handling

Hash

The Synchronization record hash column is automatically calculated as an MD5 of the config_json column on record creation.
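
As an illustration of the hash calculation, the Python sketch below computes an MD5 digest over a config_json string. The keys inside the sample configuration and the exact JSON serialization are assumptions made for the example; the real contents of config_json are not specified here.

```python
import hashlib
import json


def synchronization_hash(config_json: str) -> str:
    """Return the MD5 hex digest of the config_json column value."""
    return hashlib.md5(config_json.encode("utf-8")).hexdigest()


# Hypothetical configuration; the real config_json layout may differ.
config_json = json.dumps(
    {"file": "/tmp/tqSync-19-01-16-1547660837-8345.tar.gz", "object_limit": 1000},
    sort_keys=True,
)
print(synchronization_hash(config_json))
```

Since the digest depends on the exact bytes of the string, two runs produce the same hash only when their serialized configurations are identical.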

Initial Creation

A Synchronization record is created at the beginning of a command run, right after all command line options have been processed. Initial creation only covers the type, started_at, pid, and config_json columns. For this command (threatq:sync-import), the type will be "import". The command line option portion of the report_json is added as well, but this column will not be complete until the record is finalized. The finished_at column remains NULL.

Finalization

A Synchronization record is finalized when the command run has completed. At this time, the finished_at column is filled with the completion date and time, and the report_json column is updated to include information about the run (object counts, tables created, etc).
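
The two stages of record handling, initial creation and finalization, can be sketched as a small Python data class. This is a simplified model for illustration only: the class, its finalize method, and the keys used inside report_json are hypothetical, while the column names follow the Synchronizations table above.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Synchronization:
    """Simplified model of a Synchronization record (illustrative only)."""

    type: str
    started_at: datetime
    pid: int
    config_json: str
    report_json: str  # holds only the command line options at creation
    finished_at: Optional[datetime] = None  # stays NULL until finalization
    hash: str = field(init=False)

    def __post_init__(self) -> None:
        # The hash is derived from config_json when the record is created.
        self.hash = hashlib.md5(self.config_json.encode("utf-8")).hexdigest()

    def finalize(self, finished_at: datetime, run_info: dict) -> None:
        """Fill finished_at and merge run details into report_json."""
        report = json.loads(self.report_json)
        report.update(run_info)
        self.report_json = json.dumps(report)
        self.finished_at = finished_at


# Initial creation: right after command line options have been processed.
options = {"object_limit": 1000, "keep_created_at": "Y"}  # hypothetical keys
record = Synchronization(
    type="import",
    started_at=datetime(2019, 1, 16, 12, 0, 0),
    pid=12345,
    config_json=json.dumps(options, sort_keys=True),
    report_json=json.dumps({"options": options}),
)

# Finalization: when the command run has completed.
record.finalize(
    datetime(2019, 1, 16, 12, 30, 0),
    {"object_counts": {"indicators": 250}, "tables_created": 1},
)
print(record.finished_at, record.report_json)
```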