Version 1.0
28th April, 2022
Data Extraction
We first extract raw data from the data source:
- Data source: e.g. The National Archives (https://www.legislation.gov.uk/), European Union law (https://eur-lex.europa.eu/)
- Data extracted: e.g. regulation content, commencement date and any associated metadata.
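As a rough illustration, one extracted record could be represented as below. This is a minimal sketch only: the class name ExtractedRegulation and its field names are hypothetical and do not describe the actual schema, and the example values are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ExtractedRegulation:
    """Hypothetical container for one regulation pulled from a data source."""
    source_url: str                            # where the regulation was retrieved from
    content: str                               # raw regulation text
    commencement_date: Optional[date] = None   # may be absent in the source
    metadata: dict = field(default_factory=dict)  # any associated metadata


# Example record with placeholder values (not real data).
record = ExtractedRegulation(
    source_url="https://www.legislation.gov.uk/",
    content="Placeholder regulation text.",
    commencement_date=date(2022, 4, 28),
    metadata={"source": "legislation.gov.uk"},
)
```

Keeping the commencement date optional reflects the assumption that not every source record carries one.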
Data Transformation & Enrichment
We transform and enrich the regulation data so that it can support certain features (e.g. filters, grouping) in the application. The transformation and enrichment process includes a series of data engineering steps and data science models.
For example:
- Regulatory Requirement: Identify which sections of the regulation are requirements for businesses, which are requirements for the authority, and which are just general text.
- Sector: Identify which sectors the regulation applies to. Currently, the main sector standard is the ONS Sector Classification.
- Regulation Relationship: Identify which regulations amend or revoke other regulations, which regulations overlap by sector, which Acts made which Statutory Instruments, etc.
- Complexity Score: Assess the level of complexity for businesses to implement the specific regulations.
- Topic: Identify which topics the regulation should be grouped under. Currently, the main topic category in the application is policy topics.
- Penalty: Identify whether the regulation specifies any penalty for non-compliance, e.g. fine, imprisonment.
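To make the idea of enrichment concrete, the sketch below tags a piece of regulation text with penalty types using simple keyword matching. It is an illustrative stand-in, not the models used in the process described here; the function name detect_penalties and the keyword list are assumptions made for the example.

```python
import re
from typing import List

# Hypothetical keyword patterns; a real model would be more sophisticated.
PENALTY_PATTERNS = {
    "fine": r"\bfines?\b",
    "imprisonment": r"\bimprisonment\b",
}


def detect_penalties(text: str) -> List[str]:
    """Return the penalty types whose keywords appear in the text."""
    found = []
    for label, pattern in PENALTY_PATTERNS.items():
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(label)
    return found


sample = "A person guilty of an offence is liable to a fine or imprisonment."
print(detect_penalties(sample))  # ['fine', 'imprisonment']
```

Keyword matching of this kind cannot tell a penalty apart from other uses of the same word, which is one reason such a rule would only serve as a starting point.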
Data Update
We conduct periodic updates of the data, i.e. we extract new data from the data source and process it with the transformation and enrichment described in the two sections above.
The extraction, transformation and enrichment of new data can take a few days. The data then undergoes a quality assurance process, which can take 1-2 weeks, before we finalize the updated data and load it into the application. Therefore, it can take approximately a month for new data to be processed and presented in the application.
Currently, the update frequency is monthly, but we aim to increase it to bi-weekly going forward.