GUIDELINE FOR SHARING OPEN DATA SETS IN MINING
The purpose of this guideline is to provide mining industry stakeholders with best practices for data sharing so that they can benefit from the opportunities that open data can offer. It leverages and references existing work on data sharing and provides additional context for mining settings. This guideline is directed towards readers who intend to share data with others, those involved in the approvals process, and those who want to use open data shared by the mining industry. The guideline covers the following key considerations for management and implementation:
- Licenses: A data license is typically put in place before data is shared or published to outline the data provider’s intended use while giving the provider protection. It also provides clarity to data consumers, helping them avoid infringing the rights of the owners. License types can typically be divided into open (without technical or legal restrictions), non-commercial, partially open or restricted usage, and closed. Existing frameworks can be used to cover general requirements.
- Benefits: Sharing data supports innovation and research and gives the public access to information that can help improve decision-making in operations.
- Challenges: The collection, administration, internal communication, and maintenance of open data raise challenges related to cost, legal issues, storage, privacy, and common language. These challenges should be addressed.
- Sharing: It is critical to identify what data should and should not be shared prior to implementation. The data that is shared should be well-documented, reliable, usable, accurate, relevant, and in an accessible format. Sharing any data that contains sensitive information should be avoided unless the risks can be acceptably mitigated (e.g., through anonymization).
- Process for making data open: When making a data set open, it should be submitted in a machine-readable format that is open and logical. If possible, any community consensus on the format or formats of existing data should be prioritized. It is also important to identify the appropriate anonymization requirements and techniques.
- Approval: It is recommended that a formal approval process be adopted when releasing data. The documentation provided for approval to release data typically includes an overview of the original data and its structure, a description of anonymization procedures, an overview of the resulting data, and an attestation or “sign-off” from key stakeholders that the data set is acceptable to share.
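
As an illustration of the sharing, process, and approval points above, the following is a minimal Python sketch of preparing a small data set for release. It is not a method prescribed by the guideline. All field names, the sample records, and the helper functions (`pseudonymize`, `prepare_for_release`, `to_csv`, `approval_summary`) are hypothetical. Salted hashing is used here as one simple pseudonymization technique and may not be sufficient anonymization for a given data set. CSV stands in for an open, machine-readable format.

```python
import csv
import hashlib
import io

# Hypothetical field names, for illustration only.
DROP_FIELDS = {"operator_name"}        # direct identifiers to remove entirely
PSEUDONYMIZE_FIELDS = {"operator_id"}  # identifiers replaced by a salted hash


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a truncated salted SHA-256 digest."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def prepare_for_release(records, salt):
    """Return copies of the records with identifiers dropped or pseudonymized."""
    released = []
    for record in records:
        row = {}
        for field, value in record.items():
            if field in DROP_FIELDS:
                continue
            if field in PSEUDONYMIZE_FIELDS:
                value = pseudonymize(str(value), salt)
            row[field] = value
        released.append(row)
    return released


def to_csv(records) -> str:
    """Serialize records to CSV text, an open machine-readable format."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def approval_summary(original, released):
    """Summarize the original and resulting data for approval documentation."""
    return {
        "original_fields": sorted(original[0].keys()) if original else [],
        "released_fields": sorted(released[0].keys()) if released else [],
        "dropped_fields": sorted(DROP_FIELDS),
        "pseudonymized_fields": sorted(PSEUDONYMIZE_FIELDS),
        "record_count": len(released),
    }


# Made-up sample records; not real operational data.
raw = [
    {"operator_name": "A. Example", "operator_id": "1001", "truck": "T-07", "payload_t": 212.5},
    {"operator_name": "B. Example", "operator_id": "1002", "truck": "T-09", "payload_t": 198.0},
]
released = prepare_for_release(raw, salt="replace-with-a-secret-salt")
csv_text = to_csv(released)
summary = approval_summary(raw, released)
```

The `summary` dictionary loosely mirrors the approval documentation described above: the structure of the original data, the anonymization steps applied, and an overview of the resulting data. Stakeholder sign-off remains a human step outside the code.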