Authors: Narayanan, Lakshmanan ([email protected]), Vaidyanathan, Praveen ([email protected]), Chandrasekaran, Vallab ([email protected])
Supported Versions: 10.5 and above
The purpose of this document is to share the product configurations and recommendations required to set up API Gateway to handle large volumes of data. These recommendations are an outcome of soak testing. This document must be treated and read as a case study, not as an official benchmarking document.
The configurations and recommendations in this document are based on observations from soak testing performed on an API Gateway cluster. In addition, the following were tested: backup, restore, archive & purge (events), and restoring events from the archive. Here are the test details:
No | Test type | Test Details | Value |
---|---|---|---|
1 | Soak testing | Test duration | 100 days |
| | Transactions per second (TPS) | 200 |
| | Transaction size | 10 KB |
| | Test completion target | 2 Billion transactions |
| | ES store size | 2940 GB (with 1 replica), i.e., 1470 GB primary data |
2 | Backup | Backup type | Incremental backup (every 8 hours) |
3 | Restore | Transaction count in the snapshot (backup) when the restore was attempted | 1.377 Billion |
| | Snapshot size | 960 GB |
4 | Archive & Purge (events) | Transaction count in Elasticsearch when the archive & purge was attempted | 1.377 Billion |
5 | Restoring events from the archive | Transaction count in the archive when it was restored | 1.377 Billion |
Purpose | No of nodes | RAM | CPU | Disk space |
---|---|---|---|---|
API Gateway with internal data store, version 10.5 | 3 | 8 GB | 2 GHz * 2 | 250 GB |
Terracotta cluster (the Nginx load balancer was installed on one of the Terracotta nodes) | 2 | 4 GB | 2 GHz * 2 | 50 GB |
Client - JMeter | 1 | 8 GB | 2 GHz * 2 | 50 GB |
Native server (Integration Server) | 2 | 4 GB | 2 GHz * 2 | 50 GB |
Elasticsearch horizontal scale-up (these VMs are kept in a VM pool and added to the ES cluster only when needed) | 6 | 4 GB | 2 GHz * 2 | 500 GB |
Start the components in the following order:

- Start Terracotta
- Start Elasticsearch nodes
- Start API Gateway

Stop them in the reverse order:

- Stop API Gateway
- Stop Elasticsearch
- Stop Terracotta
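For reference, a hedged sketch of this sequence on a default Linux installation. The install path /opt/softwareag and the instance name "default" are assumptions; your paths may differ:

```
# Assumed default install location - adjust to your environment
SAG_HOME=/opt/softwareag

# Start order
$SAG_HOME/Terracotta/server/bin/start-tc-server.sh &    # Terracotta (on each Terracotta node)
$SAG_HOME/InternalDataStore/bin/startup.sh              # Elasticsearch / internal data store (on each ES node)
$SAG_HOME/profiles/IS_default/bin/startup.sh            # API Gateway (on each API Gateway node)

# Stop order (reverse)
$SAG_HOME/profiles/IS_default/bin/shutdown.sh           # API Gateway
$SAG_HOME/InternalDataStore/bin/shutdown.sh             # Elasticsearch
$SAG_HOME/Terracotta/server/bin/stop-tc-server.sh       # Terracotta
```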
This section describes the configurations needed to make API Gateway connect to the desired Elasticsearch cluster. Set the below properties in system-settings.yml, available under \IntegrationServer\instances\\packages\WmAPIGateway\resources\configuration, on all API Gateway nodes.
Note: By default, the externalized configuration is not enabled. A default template is available under \IntegrationServer\instances\\packages\WmAPIGateway\resources\configuration. Add the desired settings to system-settings.yml, then let API Gateway know that this is a configuration file by enabling it in config-sources.yml (uncomment the appropriate lines). Refer to the attached files for reference - config-sources.yml and system-settings.yml.
Configuration | Explanation |
---|---|
apigw.elasticsearch.autostart=false | Since the Elasticsearch cluster is started before API Gateway, set this to false so that API Gateway does not try to start the internal data store. |
apigw.elasticsearch.hosts=<Elasticsearch LB or Elasticsearch host>:<ES port> | Example: apigw.elasticsearch.hosts=localhost:9240. It is enough to provide one host and port, provided all the Elasticsearch nodes can be reached via the publish address set in Elasticsearch. |
apigw.elasticsearch.sniff.enable=false | By default, this value is true, which means API Gateway fetches the list of nodes available in the Elasticsearch cluster and balances requests across all of them. Set this value to false only in the below scenarios: 1. The load balancer host and port of Elasticsearch is specified in apigw.elasticsearch.hosts. 2. The publish addresses of the Elasticsearch nodes are not accessible from API Gateway; in that case, also provide all hosts and ports in the apigw.elasticsearch.hosts property. The publish address can be found using http://<host>:<port>/_nodes/http |
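For illustration, a minimal sketch of these entries in system-settings.yml, assuming an Elasticsearch load balancer at es-lb:9240 (a hypothetical host); mirror the key layout of the template shipped in the configuration folder:

```
# Hypothetical example - host/port must match your Elasticsearch LB or nodes
apigw.elasticsearch.autostart: false
apigw.elasticsearch.hosts: es-lb:9240
apigw.elasticsearch.sniff.enable: false
```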
Out of the box, Elasticsearch (the internal data store) ships with a default configuration. See the below recommendations to set up the initial Elasticsearch cluster.
Configuration | Explanation |
---|---|
Minimum number of nodes | Minimum number of nodes required is 3 |
Set all three nodes as master-eligible | By default, every node is master-eligible unless node.master is explicitly set to false. |
Set minimum heap space as 2 GB | Follow the below steps to increase or decrease the heap space of an Elasticsearch node: 1. Go to <Install_location>\InternalDataStore\config\jvm.options. 2. Change the value of the -Xmx<number>g property. For example, to increase from 2 GB to 4 GB, set the value to -Xmx4g. |
node.name | Set a human readable node name by setting "node.name" property in elasticsearch.yml in all nodes |
Initial master nodes | Add all three node names to cluster.initial_master_nodes in elasticsearch.yml. These are the nodes responsible for forming a single cluster the very first time the Elasticsearch cluster starts. As per the Elasticsearch recommendation, add at least three master-eligible nodes to cluster.initial_master_nodes. |
Discovery seed hosts | Add the three nodes as host:port entries in discovery.seed_hosts. Elasticsearch discovers the cluster nodes using the hosts specified in this property. |
Path repo | Configure path.repo to a common location that is accessible to all Elasticsearch nodes. All backups taken using either an Elasticsearch snapshot or the API Gateway backup utility are stored here. Refer to this article: https://techcommunity.softwareag.com/pwiki/-/wiki/Main/Periodical%20Data%20backup. Backup to an AWS S3 bucket or a shared file system is also available, so that local disk space is not occupied. |
Refresh interval | After starting API Gateway, set the refresh interval for the events index type: go to API Gateway UI -> Administration -> Extended settings, set eventRefreshInterval to 60s, and save it. In Elasticsearch, the operation that makes updates to the data visible to search is called a refresh. It is a costly operation when there are large volumes of data, and calling it often while there is ongoing indexing activity can impact indexing speed. This setting makes the index refresh every minute. |
Disk-based shard allocation settings | Elasticsearch uses the below settings to consider the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from it. 1. cluster.routing.allocation.disk.watermark.low (default 85%): Elasticsearch stops allocating new shards to nodes that have more than 85% disk used. 2. cluster.routing.allocation.disk.watermark.high (default 90%): Elasticsearch attempts to relocate shards away from a node whose disk usage is above 90%. 3. cluster.routing.allocation.disk.watermark.flood_stage (default 95%): Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on a node with at least one disk exceeding the flood stage. This is the last resort to prevent nodes from running out of disk space. The values can be set as percentages or as absolute values. If the nodes have equal disk space, configure the values as percentages: curl -X PUT "http://localhost:9240/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '{ "persistent": { "cluster.routing.allocation.disk.watermark.low": "75%", "cluster.routing.allocation.disk.watermark.high": "85%", "cluster.routing.allocation.disk.watermark.flood_stage": "95%", "cluster.info.update.interval": "1m" } }'. If the node disk spaces are not equal, provide absolute values based on the available disk size (absolute watermarks specify the minimum free space). Example: curl -X PUT "http://localhost:9240/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '{ "persistent": { "cluster.routing.allocation.disk.watermark.low": "100gb", "cluster.routing.allocation.disk.watermark.high": "50gb", "cluster.routing.allocation.disk.watermark.flood_stage": "10gb", "cluster.info.update.interval": "1m" } }' |
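The supported way to change the refresh interval is the eventRefreshInterval extended setting described above. For reference only, a sketch of the equivalent direct Elasticsearch call on the 10.5 transactional events index (not the recommended route):

```
# Sketch: set a 60s refresh interval directly on the index via Elasticsearch
curl -X PUT "http://localhost:9240/gateway_default_analytics_transactionalevents/_settings" \
  -H "Content-Type: application/json" \
  -d '{"index": {"refresh_interval": "60s"}}'
```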
If an external Kibana is used, perform the following steps to turn off the internal Kibana and use the external one.
Configuration | Explanation |
---|---|
Autostart | Go to <installlocation>\IntegrationServer\instances\<tenant>\packages\WmAPIGateway\resources\configuration\system-settings.yml, set apigw.kibana.autostart=false, and update apigw.kibana.elasticsearch.hosts=http(s)://<eshost>:<esport>. Do this on all API Gateway nodes. |
Dashboard Instance | Go to <installlocation>\IntegrationServer\instances\<tenant>\packages\WmAPIGateway\resources\configuration\system-settings.yml and set apigw.kibana.dashboardInstance=http://<host>:<kibanaport> (e.g., http://localhost:9405) |
Request Timeout | Perform this step irrespective of whether you use an external Kibana or the one bundled with API Gateway. In kibana.yml, change the elasticsearch.requestTimeout property from the default (30s) to 120s. This setting controls how long Kibana waits for responses from Elasticsearch; increase it for rendering analytics over high volumes (say, 500Mn or 1Bn transactions). Note: This setting worked until 1.6Bn transactions, after which we increased it to 180s. |
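In kibana.yml this timeout is expressed in milliseconds, so the change from the table above would look like this:

```
# kibana.yml - how long Kibana waits (in ms) for Elasticsearch responses
elasticsearch.requestTimeout: 120000   # default is 30000 (30s); raised to 180000 beyond ~1.6Bn events
```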
Configuration | Explanation |
---|---|
Watt properties for handling high volumes | watt.security.ssl.cacheClientSessions=true, watt.net.maxClientKeepaliveConns=500, watt.server.threadPool=900, watt.server.threadPoolMin=200, watt.net.clientKeepaliveUsageLimit=10000000 |
API Gateway extended settings for handling high volumes | To handle high TPS between API Gateway and Elasticsearch for logging transactions: events.collectionPool.maxThreads=16, events.ReportingPool.maxThreads=8 |
1. Create backup
You can use the below command to back up the API Gateway data. Go to \IntegrationServer\instances\\packages\WmAPIGateway\cli and execute the below command periodically (daily or weekly):
apigatewayUtil.bat/sh create backup -name <backupName> -tenant <default or configured tenant name> -repo <repo_name>
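For example, a daily run with hypothetical values (the default tenant and a repository named backup-repo):

```
./apigatewayUtil.sh create backup -name backup_2021_01_06 -tenant default -repo backup-repo
```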
2. Verify the backup
Go to \IntegrationServer\instances\\packages\WmAPIGateway\cli and execute the below command:
apigatewayUtil.bat/sh status backup -name <backupName> -tenant <default or configured tenant name> -repo <repo_name>
Note: For periodic backups, each backup name should be different and meaningful so that it can be used for restore.
3. Housekeeping of backup
Snapshots are generally taken periodically - daily, weekly, or at some defined interval. It is better to clean up old snapshots (backups) after some time, in line with your data retention period, to free up disk space.
- List the backups - Go to \IntegrationServer\instances\\packages\WmAPIGateway\cli and execute the below command

apigatewayUtil.bat/sh list backup -tenant <default or configured tenant name> -repo <repo_name>
- Delete the old backups - Go to \IntegrationServer\instances\\packages\WmAPIGateway\cli and execute the below command

apigatewayUtil.bat/sh delete backup -name <name of the backup to delete> -tenant <default or configured tenant name> -repo <repo_name>
4. Schedule Periodic backup:
Refer to this article: https://techcommunity.softwareag.com/pwiki/-/wiki/Main/Periodical%20Data%20backup. Backup to an AWS S3 bucket is also available so that local disk space is not occupied.
To restore a backup using the API Gateway utility tool, execute:
apigatewayUtil.bat/sh restore backup -name <backupName> -tenant <default or configured tenant name> -repo <repo_name>
You can also restore specific assets. Use apigatewayUtil.bat/sh -help to learn more about the commands and their options.
- Purge from UI:
Users can perform the purge operation through the UI. Go to API Gateway -> Administration -> Manage Data -> Archive and purge. Select the desired event type and time duration and click Purge. The purge job is then triggered.
- Purge using REST endpoint: Purge the events using the below endpoint

http://localhost:5555/rest/apigateway/apitransactions?eventType=<eventtype>&objectType=Analytics&olderThan=<timeline>

eventType: one of ["ALL", "transactionalEvents", "lifecycleEvents", "performanceMetrics", "monitorEvents", "threatProtectionEvents", "policyViolationEvents", "errorEvents", "auditlogs", "applicationlogs"]

olderThan: specify years, months, or days, optionally with a time, in the following formats:
- Year: <number>Y (example: 1Y)
- Month: <number>M (example: 2M)
- Days: <number>d (example: 90d)
- Time: <number>h<number>m<number>s (example: 14h30m2s)
Example: Purging data that is older than 90 days, 2 hours, and 3 minutes:
curl -X DELETE -H "Authorization: Basic QWRtaW5pc3RyYXRvcjptYW5hZ2U=" -H "Accept: application/json" "http://localhost:5555/rest/apigateway/apitransactions?eventType=ALL&objectType=Analytics&olderThan=90d2h3m"
The above REST endpoint returns a job ID if the purge request is successful. Check whether the purge succeeded using the below endpoint:
http://localhost:5555/rest/apigateway/apitransactions/jobs/<job_id>
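For example, with the same Administrator credentials as the purge call and a hypothetical job ID:

```
curl -X GET -H "Authorization: Basic QWRtaW5pc3RyYXRvcjptYW5hZ2U=" -H "Accept: application/json" \
  "http://localhost:5555/rest/apigateway/apitransactions/jobs/<job_id>"
```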
- Schedule periodic purge: It is important to schedule the purge operation using the REST endpoint periodically, based on your analytics retention time.
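A hedged sketch of such a schedule as a crontab entry; the retention period (90d) and the credentials are assumptions, so adjust them to your environment:

```
# Run every Sunday at 01:00: purge all analytics events older than 90 days
0 1 * * 0 curl -X DELETE -H "Authorization: Basic QWRtaW5pc3RyYXRvcjptYW5hZ2U=" "http://localhost:5555/rest/apigateway/apitransactions?eventType=ALL&objectType=Analytics&olderThan=90d"
```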
Note: Elasticsearch purging is a time-, memory-, and disk-space-consuming process. Do this whenever there is less load on the server.
To avoid purging altogether, you can roll over the event-related indices (check this document for details on the different indices and their usage) daily or at a defined period. Rolling over an index makes the current index (holding events) read-only and creates a new index for storing new events, linking it to the existing alias. This allows us to delete the oldest index based on its date instead of purging old events.
Deleting an index is almost instant, and the disk space occupied by the oldest events is reclaimed immediately.
Users can roll over the event-related indices. Refer to the monitoring section for how to roll over an index.
Rollover index options:
- Daily rollover: If you roll over the index daily, check the number of shards across the cluster at each rollover, because daily rollovers increase the shard count. Elasticsearch's recommended best practice is at most 20 shards per GB of allocated heap space.
To keep the shard count in check, consider the sizing from the above experiment: 2.9 TB of disk space was used over 200 days, i.e., roughly 15 GB of disk space per day for primary and replica data. Based on these values, you can set 1 primary and 1 replica shard for the daily rollover of events. During the rollover of an index, you can specify the number of primary and replica shards for the rollover (new) index, as shown below.
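As a sketch, the rollover request can carry the shard settings for the new index. The index names here follow the date-based scheme described later in this section and are hypothetical:

```
# Roll over the transactional events alias, giving the new index 1 primary and 1 replica shard
curl -X POST "http://localhost:9240/gateway_default_analytics_transactionalevents/_rollover/gateway_default_analytics_transactionalevents_20210107" \
  -H "Content-Type: application/json" \
  -d '{"settings": {"index.number_of_shards": 1, "index.number_of_replicas": 1}}'
```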
- Based on disk size: If you don't want to alter the number of shards, you can choose the rollover period based on disk size. Whenever the events index size reaches 25 GB per shard (i.e., number of shards * 25 GB in total), roll the events over to a new index. In this approach the events are stored in fixed periods, and you can delete events one period at a time.
For example, if transaction events take 10 days to reach 125 GB (5 primary shards), the events are stored in 10-day periods. So, every tenth day you can roll over the index (create a new index to store events) and delete the oldest index that has crossed the retention period.
You can roll over the event-related indices periodically and provide the new index name in the format gateway_<tenant>_analytics_<eventtype>_yyyymmdd during rollover.
Example: rolling over the transactional events index by date, creating a new index to store data generated after 6th Jan 2021:
curl -X POST "http://localhost:9240/gateway_default_analytics_transactionalevents/_rollover/gateway_default_analytics_transactionalevents_20210106" -H "content-type: application/json" -d "{}"
This way, whenever we roll over, we can delete the oldest index based on its date instead of purging old events.
For example, we can delete the index from 4th October 2020 by just computing the index name: gateway_default_analytics_transactionalevents_20201004. We can delete any index older than 90 days (the retention period) by computing the index name in the same way. This is very simple, as deletion of an index happens instantly and can be done at any time.
All event indices beyond a particular month can be easily identified and deleted. If you want to delete all event indices created in October 2020, use the below query to list all event indices belonging to October 2020 and delete the listed indices:
http://localhost:9240/_cat/indices/gateway_default_analytics_*events_202010*?v&s=i
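Deleting a listed index is then a single call; for example, for the hypothetical 4th October 2020 index mentioned above:

```
curl -X DELETE "http://localhost:9240/gateway_default_analytics_transactionalevents_20201004"
```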
It is important to manage the log files.
By default, the logs will be stored in the below location
- \IntegrationServer\instances\\logs
- \profiles\IS_\logs
- \InternalDataStore\logs
Please set the properties for each component to enable automatic log rotation
Go to \InternalDataStore\config\log4j2.properties and set the below properties
Key | Value | Possible values |
---|---|---|
appender.rolling.strategy.action.condition.nested_condition.type | IfAny | IfAny/IfAccumulatedFileSize |
appender.rolling.strategy.action.condition.nested_condition.exceeds | 256MB | File Size units : MB/GB |
appender.rolling.strategy.action.condition.nested_condition.lastMod.type | IfLastModified | |
appender.rolling.strategy.action.condition.nested_condition.lastMod.age | 7D | |
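Put together, the relevant lines in log4j2.properties would look like this (the surrounding appender definition shipped with the internal data store is assumed to be already present):

```
appender.rolling.strategy.action.condition.nested_condition.type = IfAny
appender.rolling.strategy.action.condition.nested_condition.exceeds = 256MB
appender.rolling.strategy.action.condition.nested_condition.lastMod.type = IfLastModified
appender.rolling.strategy.action.condition.nested_condition.lastMod.age = 7D
```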
Change the below watt properties to enable log rotation for the Integration server
Key | Value | Comments |
---|---|---|
watt.server.serverlogFilesToKeep | 100 | |
watt.server.serverlogRotateSize | 10MB | File Size units : MB/GB |
watt.server.audit.logFilesToKeep | 100 | |
watt.server.audit.logRotateSize | 10MB | |
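For reference, the values above in key=value form:

```
watt.server.serverlogFilesToKeep=100
watt.server.serverlogRotateSize=10MB
watt.server.audit.logFilesToKeep=100
watt.server.audit.logRotateSize=10MB
```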
To enable log rotation for OSGI and wrapper set the below properties
OSGI Logs : /opt/softwareag/profiles/IS_APIGateway/configuration/logging/log4j2.properties
Key | Value |
---|---|
appender.rolling.policies.size | 10MB |
appender.rolling.strategy.max | 30 |
Wrapper Logs: IS_APIGateway/configuration/custom_wrapper.conf
Key | Value |
---|---|
wrapper.logfile.maxfiles | 30 |
wrapper.logfile.maxsize | 10MB |
To enable log rotation for Kibana, set the below properties in \profiles\IS_default\apigateway\dashboard\config\kibana.yml
Key | Value |
---|---|
logging.dest | ./kibana.log |
logging.rotate.enabled | true |
logging.rotate.everyBytes | 10485760 |
logging.rotate.keepFiles | 30 |
curl -X "GET" "http://localhost:9240/_cat/shards?v&s=store:desc"
This displays all the shards with their disk usage, sorted in descending order. From the response, get the disk size used by each shard. If a shard's disk size is about to reach, equals, or exceeds 25 GB, take the below actions.
Any one of the below actions can be taken to recover disk space
- Purge the data corresponding to that index if it holds events (refer to the purge section), or
- Roll over the index. By default, API Gateway creates an alias for all events. Below are the aliases (you can list them using http://localhost:9240/_cat/aliases?v), and you can find the corresponding index by checking http://localhost:9240/<aliasname>, which displays the current write index. Below is the list of aliases in API Gateway 10.5:
- gateway_<tenant>_analytics_transactionalevents
- gateway_<tenant>_analytics_performancemetrics
- gateway_<tenant>_analytics_policyviolationevents
- gateway_<tenant>_analytics_lifecycleevents
- gateway_<tenant>_analytics_errorevents
- gateway_<tenant>_audit_auditlogs
- gateway_<tenant>_analytics_monitorevents
- gateway_<tenant>_log
- gateway_<tenant>_analytics_threatprotectionevents
To roll over an index, follow the below steps: a new index is created and all data is written to it, while the old index becomes read-only.
curl -X POST "http://localhost:9240/<alias>/_rollover/<new_index_name>" -H "content-type: application/json" -d "{}"
Note: API Gateway has already created templates that automatically add mappings and settings for rollover indices. Hence, the new index name should start with the alias name, followed by any applicable characters allowed by Elasticsearch.
Example: To roll over transactional events, the request should be:
curl -X POST "http://localhost:9240/gateway_default_analytics_transactionalevents/_rollover/gateway_default_analytics_transactionalevents-000002" -H "content-type: application/json" -d "{}"
Use the below curl command to get the disk space of the Elasticsearch nodes:
curl -X GET http://localhost:9240/_nodes/stats/fs
It lists the disk space available on all nodes. To extract the disk space, use the below JSON path expressions:
- Total disk space -> $.nodes..fs.total.total_in_bytes
- Free disk space -> $.nodes..fs.total.free_in_bytes
- Available disk space -> $.nodes..fs.total.available_in_bytes
To know the configured disk watermarks in Elasticsearch, use the below command:
curl -X GET http://localhost:9240/_cluster/settings?pretty
It returns the configured disk watermarks in the response.
To get the different watermark levels, use the below JSON path expressions:
- low -> $.persistent.cluster.routing.allocation.disk.watermark.low
- high -> $.persistent.cluster.routing.allocation.disk.watermark.high
- flood_stage -> $.persistent.cluster.routing.allocation.disk.watermark.flood_stage
Convert the disk space from bytes to GB (bytes / (1024*1024*1024)) and calculate the available disk space percentage.
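A hedged one-liner to do this calculation per node, assuming the jq utility is available:

```
curl -s "http://localhost:9240/_nodes/stats/fs" | jq -r '.nodes[] |
  "\(.name): \((.fs.total.available_in_bytes / .fs.total.total_in_bytes * 100) | floor)% disk available"'
```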
To check any other process-specific metrics, use the below curl command:
curl -X GET http://localhost:9240/_nodes/stats/<metric>
The list of available metrics can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html
If the available disk space falls below the configured low watermark ($.persistent.cluster.routing.allocation.disk.watermark.low), do one of the below actions:
- Purge (refer purge section)
- Scale up the node (refer to scaling section)
- Add additional disk space to the existing node
To know the cluster health, use the below command:
curl -X GET http://localhost:9240/_cluster/health?pretty
It responds with the cluster health status.
- From the response, check the health status using the JSON path expression $.status
- From the response, check the number of nodes using the JSON path expression $.number_of_nodes
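For a quick check of both values, assuming jq is available:

```
curl -s "http://localhost:9240/_cluster/health" | jq '{status, number_of_nodes}'
```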
If the cluster health is any of the below colors:
- Green: No action needed.
- Yellow: When Elasticsearch holds huge data, it can take some time to become green; wait 5 to 10 minutes. If it does not become green, identify the cause of the yellow status and rectify it. During this time Elasticsearch can still process requests for the indices that are available.
- If there are unassigned shards, determine why they are unassigned and act accordingly.
- Execute this command to check the list of unassigned shards:

curl -X GET "http://localhost:9240/_cat/shards?h=index,shard,primaryOrReplica,state,docs,store,ip,node,segments.count,unassigned.at,unassigned.details,unassigned.for,unassigned.reason&s=index&v"

- Execute this command to check the unassignment reason for a specific shard:

curl -X GET "http://localhost:9240/_cluster/allocation/explain" -H "Content-Type: application/json" -d '{ "index": "<index name>", "primary": <true|false>, "shard": <shard number> }'
- Red: This might occur when Elasticsearch nodes are down or unreachable, or the master is not discovered. If the number of nodes in the response does not match the number of Elasticsearch nodes configured, identify the node that did not join the cluster and check it.
To get the number of shards in Elasticsearch, use the below command:
curl -X GET "http://localhost:9240/_cluster/health?pretty"
If the total number of active shards from the response exceeds (heap space in GB * number of nodes * 20), increase the heap space of the Elasticsearch nodes or add a new Elasticsearch node.
As per the Elasticsearch recommendation, a maximum of 20 active shards per GB of heap space is considered healthy.
- Scale up the Elasticsearch node. Refer to the scaling section.
- If you cannot scale Elasticsearch for some reason, increase the heap size as the last option. The heap space should not be more than half of the system memory (RAM). (For example, if the system memory is 16 GB, allocate a maximum of 8 GB to Elasticsearch, and no more in any case.) A sketch of the shard-count check follows this list.
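As a sketch of that check (jq assumed): with 3 nodes at 4 GB heap each, the threshold is 4 * 3 * 20 = 240 active shards.

```
# Compare this number against (heap GB * number of nodes * 20)
curl -s "http://localhost:9240/_cluster/health" | jq .active_shards
```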
API Gateway provides two key endpoints for monitoring API Gateway health. Refer to the user guide for details of these endpoints.
curl -X GET "http://localhost:5555/rest/apigateway/health/engine"
curl -X GET "http://localhost:5555/rest/apigateway/health/admin"
Additionally, API Gateway provides an endpoint for metrics: http://localhost:5555/metrics
- Results from the above endpoints will point you to the problem area. Take appropriate actions.
- Disk size
Refer to the monitoring section for how to check the disk size. If the available disk space matches the criteria defined in the disk watermarks, add nodes.
- Number of Shards
Refer to the "Number of shards" section under monitoring to get the shard count. For example, if 3 Elasticsearch nodes are configured with 4 GB of heap space each, we can have 4 * 3 * 20 = 240 active shards. Refer to the Elasticsearch scale-up section.
- Install Elasticsearch or the internal data store on the new node
- Configure the desired heap space.
- To add a data node, set node.master: false and node.data: true in the elasticsearch.yml file
- To add a master node, set node.master: true in elasticsearch.yml
- Configure path.repo as it is set on the other Elasticsearch nodes
- Set discovery.seed_hosts to the nodes listed in cluster.initial_master_nodes, or provide the hosts of stable master nodes. Other nodes will be discovered automatically.
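A minimal sketch of elasticsearch.yml for an additional data-only node; the node name, hosts, and repo path are hypothetical placeholders:

```
node.name: es-data-4
node.master: false
node.data: true
path.repo: ["/mnt/shared/es-backups"]                    # same shared location as the existing nodes
discovery.seed_hosts: ["es-node-1", "es-node-2", "es-node-3"]
```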
- To remove a data node, just shut down the node.
- To remove a master node, it must be one that is not configured in cluster.initial_master_nodes.
- Don't remove a master node that is configured in cluster.initial_master_nodes: Elasticsearch nodes will not come up if the nodes specified in cluster.initial_master_nodes are not available.
An API Gateway cluster set up with the above-mentioned tuning can serve up to 200 TPS. To serve more TPS, you can scale up the API Gateway cluster. Monitor thread usage and memory utilization as scaling criteria.
- Scaling up an API Gateway would mean adding a new API Gateway node to an existing cluster. Refer to the Clustering API Gateway section in API Gateway documentation for the details.
- If the API Gateway node is configured properly for the cluster, you can add the new node to the load balancer, or add the IP of the new node to the DNS server if the LB is configured to use DNS load balancing. Setting "portClusteringEnabled" to true on all nodes lets the new node inherit the port settings and start serving requests immediately.
- In a paired deployment setup, if a new node is getting added to DMZ, connections must be established explicitly from all nodes in the green zone to DMZ. One could use the API Gateway REST API to automate these port settings. Then the new node can be added to LB as said above.
- Put the node in "Quiesce" mode. It will start rejecting requests, and the LB routes requests to other healthy nodes. Allow some cooling period for in-flight transactions to complete. Bring the instance down and remove it from the LB.
- Scaling down is not straightforward for Paired Gateway because of P2P communication.
- To scale down the DMZ nodes, remove them from LB.
- To scale down the green-zone nodes in paired gateway setup, disable the internal ports using REST API. Bring the instance down. In-flight transactions would fail as the communication channel is closed.
Separation of core data and analytics data is recommended for customers managing large transaction volumes. Customers can use the external Elasticsearch destination feature to store all events in a separate Elasticsearch instance. This way, runtime events and API Gateway core data are kept apart. API Gateway core data is generally much smaller than event data, so backing it up is easy, and restoring the core data alone is fast and easier to manage.