OPEN DATA SETS FOR AI IN MINING

The project aims to develop a guideline for the collection, cleaning, labelling, and curating of open data sets to help the industry test and train their models for a variety of AI applications.

Open, curated data sets can bring value to the industry. They give developers and researchers suitable data to test and train their models for a variety of applications. They can also be used to benchmark solutions for effective and fair comparison, and they allow research to be repeated and validated.

The purpose of this project is to build open data sets specific to the mining sector for AI research and development. It will create a suitable data set repository for broad industry access and establish a data-gathering process by:

  • Identifying existing data sets already released to the public.
  • Identifying the types of typical mining data that would be good candidates for building publicly available data sets for broader industry use.
  • Developing a set of principles and guidelines to underpin the collection, cleaning, labelling, and curation of appropriate data.
  • Executing several sub-projects to build data sets according to the guidelines and goals developed above.

KEY DELIVERABLES

  • A register of suitable candidate data sets.
  • A set of guidelines for the collection and curation of these data sets.
  • A set of repositories of gathered data.

PROJECT LATEST UPDATE

VIRTUAL MEETING. Upcoming Open Data Sets Project Call on September 28th. Click here to register.

September 28th | 8am-9am (EDT) | 12pm-1pm (UTC) | 8pm-9pm (AWST)

The draft is under preliminary editing by the Technical Editor.

PROJECT HISTORY

2020 Jul | Project team call

The project team conducted a final revision of the draft before submitting it to the GMG Technical Editor for preliminary editing.

2020 Jul | Project presentation at Artificial Intelligence Working Group virtual meeting

The project leaders presented the latest updates on the project and ran a mini-workshop during the call in which participants revised and commented on the risk assessment. To access the material, please click here.

2020 Jun | Virtual Workshop

The participants worked directly in the document, developing content, expanding on topics, providing feedback, and sharing references.

2020 May | First project team call

The project leaders presented the outcomes from the workshops and shared the project plan with the volunteers. They then divided the guideline sections among the volunteers to begin content development. Click here to check out the outcomes.

2020 May | Virtual Workshop

The second workshop followed the same format as the first to ensure participation across the globe. The project leaders analyzed all input and built the table of contents.

2020 Apr | Virtual Workshop

Workshop participants defined the key stakeholders to involve in the project, discussed the benefits of the project, divided the guideline into business and technical sides, and discussed potential data sets. To access the outcomes, please click here.

2020 Feb | Project launched at SME Annual Event

The project leader presented on Open Data and how it will enable innovation in the industry. To access the presentation, please click here.

