Analysis on Mine Accident Injuries Dataset
Mineral mining has a long history in America, and early mining practice may date back to the Gold Rush period of the 19th century. Mining can make people rich, but it can also kill them: it is one of the most dangerous jobs, and history has witnessed numerous mine accidents that took thousands of lives. The Monongah Mining Disaster was the worst mining accident in American history; 362 workers were killed in an underground explosion on December 6, 1907 in Monongah, West Virginia. We may attribute this disaster to the poor working conditions and equipment of the time. However, even with modern mining equipment and regulation, mining accidents continue to injure miners and ruin their families. On January 2, 2006, the Sago Mine disaster trapped 13 miners for nearly two days; only one miner survived. Although efforts have been made to investigate accidents, advise industry, conduct production and safety research, and teach courses in accident prevention, the situation remains difficult and accidents have not been eliminated.
- Rongjian Lan: firstname.lastname@example.org
- Yulu Wang: email@example.com
The Data Set
This data set documents all mine accidents reported by mine operators and contractors since 1983. It is derived from the U.S. Department of Labor and contains 165,775 records with key attributes such as the mine, the controller, the operator, the accident time, the degree of injury, the accident type, and the sub-unit of the mine. The data source is linked here: 
To Dodge Severe Mine Accident, Get Experienced!
Working experience is an important factor in the quality of work in almost all professions, and mining is no exception. The Scatter Plot visualization is constructed with the X-Axis denoting the accident date and the Y-Axis denoting the mining experience of the victims. Color differentiates the degree of injury: green represents minor injuries, black represents fatalities, and pink represents permanent injuries. From the visualization, we can see that most of the severe accidents involve workers with less mining experience, while minor accidents happen to workers across the whole range of experience. Therefore, mining companies could reduce the number of severe accidents by putting more effort into training their workers, thus reducing accident-related revenue loss.
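The comparison read off the scatter plot can also be checked numerically. The following is a minimal sketch, assuming each record is a dictionary with the hypothetical keys `degree_of_injury` and `experience_years`; the real attribute names in the dataset may differ, and the sample rows are made up for illustration only.

```python
from collections import defaultdict
from statistics import median


def median_experience_by_injury(records):
    """Return {degree_of_injury: median years of mining experience}.

    Records whose experience is missing (None) are skipped.
    The key names are assumptions, not the dataset's actual columns.
    """
    groups = defaultdict(list)
    for rec in records:
        exp = rec.get("experience_years")
        if exp is None:
            continue
        groups[rec["degree_of_injury"]].append(exp)
    return {degree: median(values) for degree, values in groups.items()}


# Made-up rows for illustration only; not taken from the dataset.
sample = [
    {"degree_of_injury": "fatality", "experience_years": 1.0},
    {"degree_of_injury": "fatality", "experience_years": 3.0},
    {"degree_of_injury": "minor", "experience_years": 2.0},
    {"degree_of_injury": "minor", "experience_years": 10.0},
    {"degree_of_injury": "minor", "experience_years": 20.0},
    {"degree_of_injury": "permanent", "experience_years": None},
]
summary = median_experience_by_injury(sample)
print(summary)
```

A lower median for the severe categories than for minor injuries would be consistent with what the scatter plot suggests, although a median alone does not establish a causal link.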
Don't Blame Bad Luck! Mine Accidents Can Be Controlled
The line chart visualization shows the number of mine accidents in each mining sub-unit over the ten-year period from 2000 to 2010. The X-Axis shows the calendar year from 2000 to 2010 and the Y-Axis shows the number of mine accidents in each year. Each line in the visualization represents a mining sub-unit, and the sub-units are distinguished by color.
From the first line chart, we can easily see that three sub-units are the main source of mine accidents: underground; strip, quarry and open pit; and mill operation/preparation plant. Fortunately, the lines for these three main contributors have clearly negative slopes, which means that the number of accidents in those sub-units fell during that ten-year period. In detail, the number of accidents in the three sub-units decreased by 43.11%, which is good to hear. However, the remaining sub-units do not seem to contribute much to the drop, even though they account for only a small part of the accidents. To show them in detail, the second line chart filters out the three major sub-units and visualizes the rest. From the slope of these lines, we can see that there is no obvious decrease in accidents.
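The decrease rate quoted above is a percent change between the first and last year. As a minimal sketch of that arithmetic, assuming yearly accident counts per sub-unit are already available as a nested dictionary (the function names and the numbers below are illustrative and do not reproduce the 43.11% figure):

```python
def percent_decrease(first, last):
    """Percent decrease from `first` to `last`; negative if the count grew."""
    if first <= 0:
        raise ValueError("first must be positive")
    return 100.0 * (first - last) / first


def decrease_by_subunit(counts, start_year, end_year):
    """counts maps sub-unit name -> {year: number of accidents}."""
    return {
        subunit: percent_decrease(by_year[start_year], by_year[end_year])
        for subunit, by_year in counts.items()
    }


# Made-up counts for illustration only; not taken from the dataset.
counts = {
    "underground": {2000: 1000, 2010: 600},
    "office": {2000: 50, 2010: 50},
}
rates = decrease_by_subunit(counts, 2000, 2010)
print(rates)
```

Comparing the rates side by side makes the contrast between the major sub-units and the rest explicit instead of leaving it to be judged from the slopes.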
From the above visualization, we can see that the rate of decrease clearly differs between sub-units. This suggests that the decrease in some sub-units is caused by external factors rather than simple luck. Since the previous ten years have witnessed a huge amount of effort to reduce the chance of mine accidents, we can hypothesize that the decrease in the three sub-units is the result of those efforts. Meanwhile, the stable accident counts in the remaining sub-units tell us that action should be taken not only on the major sub-units but also on the other parts of the mining operation.
Be Careful When You Start Working!
The bar chart in figure 1 shows the number of mine accidents at different times of day for each month of the year. The X-Axis shows the calendar month from January to December and the Y-Axis shows the number of mine accidents. Each segment of a bar represents a time of day, stacked in increasing order from bottom to top; the length of a segment indicates the number of mine accidents and its color denotes the time of day.
The scatter plot in figure 2 shows essentially the same data as figure 1, but instead of stacking the times of day as bar segments, it draws each time of day as a circle. The X-Axis still indicates the month, while the Y-Axis indicates the number of accidents that occurred at that specific time.
Looking at the relation between the number of accidents and time (time of day, month, or quarter of the year) in figure 1, it is surprising to find that the last quarter tends to have the fewest accidents while August has the most. What is more surprising is that this pattern does not seem to be related to temperature; otherwise January, which is as cold as December, would have fewer accidents than it actually does. A more likely explanation is the national holidays, when workers do not need to work: holidays are more numerous and longer in November and December, while August has none.
As for the time of day when accidents happen most, intuition says it should be at night, when people are usually too tired or sleepy to concentrate and are therefore more likely to make mistakes that cause accidents. But the statistics show the opposite. In the first figure, for every month, the time with the most accidents is 10AM. For a more detailed view of how the number of accidents varies with time, take a look at figure 2, where darker circles represent earlier times of day and lighter circles represent later times. The top circle for every month is 10AM, and the later in the day, the fewer accidents occur. So if 10AM is around the time when workers start their day, they should be very careful, as they may have just woken up and may not yet be ready for work that requires care and concentration.
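The counts behind figures 1 and 2 amount to tallying accidents by month and by hour of day. A minimal sketch, assuming each accident's date and time has already been parsed into a Python `datetime` (the sample timestamps are made up for illustration):

```python
from collections import Counter
from datetime import datetime


def count_by_month_and_hour(timestamps):
    """Return (accidents per month, accidents per hour of day)."""
    by_month = Counter()
    by_hour = Counter()
    for ts in timestamps:
        by_month[ts.month] += 1
        by_hour[ts.hour] += 1
    return by_month, by_hour


# Made-up timestamps for illustration only; not taken from the dataset.
sample = [
    datetime(2005, 8, 3, 10, 15),
    datetime(2005, 8, 17, 10, 40),
    datetime(2005, 12, 1, 22, 5),
]
by_month, by_hour = count_by_month_and_hour(sample)
peak_hour, peak_count = by_hour.most_common(1)[0]
print(by_month, peak_hour, peak_count)
```

Raw counts like these do not account for how many workers are on shift at each hour, so a peak at a given hour may partly reflect staffing rather than risk.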
Overall, SpotFire is a powerful integrated visualization system that features most of the useful visualization and interaction techniques users need for efficient and meaningful visual analytics. It is a well-developed and well-documented commercial product. Its main advantage is that it can handle several orders of magnitude more data than academic visualization tools. However, SpotFire has some minor flaws that leave room for improvement.
- During dynamic filtering, the scale of the bar chart changes according to the current maximum value, which is good for analyzing the current data on its own but makes comparison across filter settings hard.
- The parallel coordinate visualization is useful, but interaction is slow on a data set with hundreds of thousands of records.
- The different views are coordinated, meaning that selected records are highlighted in all views. However, the coordinated highlighting in many visualizations, such as the bar chart and TreeMap, is not meaningful because it highlights the whole data group even when the group contains only one of the selected records. I suggest adding partial highlighting to indicate the portion of selected records within the whole group.
- There is no easy way to filter out a single group of records while leaving the others, for example to exclude the group with missing data.
- The Scatter Plot overlapping problem is severe when the size of the data set exceeds what the screen can display. If layer ordering were provided, users could alleviate the overlap through interaction.
- SpotFire doesn't have easy-to-access sorting functionality, which is a crucial feature for a visualization system.
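The partial highlighting suggested in the list above comes down to computing, for each group, the fraction of its records that are currently selected. The following is a minimal sketch of that computation, not a description of how SpotFire works; the record layout and the key names `id` and `subunit` are hypothetical.

```python
from collections import Counter


def selected_fraction_by_group(records, selected_ids, group_key):
    """Return {group: fraction of that group's records that are selected}.

    A bar or TreeMap cell could then be highlighted in proportion to
    this fraction instead of being highlighted as a whole.
    """
    totals = Counter()
    selected = Counter()
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec["id"] in selected_ids:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


# Made-up records for illustration only; not taken from the dataset.
records = [
    {"id": 1, "subunit": "underground"},
    {"id": 2, "subunit": "underground"},
    {"id": 3, "subunit": "underground"},
    {"id": 4, "subunit": "office"},
]
fractions = selected_fraction_by_group(records, {1}, "subunit")
print(fractions)
```

With one of three underground records selected, only a third of that bar would be highlighted, which distinguishes it from a group that is entirely selected.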