Analysis of Science and Technology Development Across the World from 1995 to 2010
Science and technology development varies significantly across nations and regions. It is usually difficult to infer useful information just by looking at the large volume of raw data collected over several decades from hundreds of nations. A good information visualization therefore helps people explore the data effectively and understand how science and technology have progressed in different nations and regions.
In our project, we mainly focus on the exploration of science and technology development of nations and regions across the world from 1995 to 2010. Are there any interesting facts available if we dig deep into the data? Data itself will tell us everything.
- Fan Yang, email@example.com
- Sheng Zha, firstname.lastname@example.org
Raw Data Source
The original raw data is publicly available on the World Bank website at http://data.worldbank.org. The data set is divided into 18 categories, each of which covers more than 200 nations, contains 15 to 50 attributes, and ranges from 1960 to 2011. Unfortunately, a large amount of data is missing for the early years and for underdeveloped nations, so attributes cannot be chosen freely for analysis. We therefore preprocess the raw data and choose the attributes that carry the most information before starting the exploration.
First, we extract only the data in the Science and Technology Development category from 1995 to 2010, since we care more about recent development. Then we choose relevant attributes with sufficient values and rearrange the years, which are originally separate attributes, into a single column. The result is a new worksheet containing 4182 items covering 246 nations and regions, with 66 attributes: country name, year, and topic-related attributes.
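The rearrangement of years from separate attributes into a single column can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure actually used in the project; the function name `wide_to_long`, the column name `country`, and the sample values are all hypothetical and are not World Bank figures.

```python
import csv
import io


def wide_to_long(rows, id_fields, value_name="value"):
    """Turn one-column-per-year rows into one record per (id, year) pair."""
    long_rows = []
    for row in rows:
        for key, cell in row.items():
            if key in id_fields:
                continue  # identifier columns are copied, not unpivoted
            if cell == "":
                continue  # skip missing observations
            record = {field: row[field] for field in id_fields}
            record["year"] = int(key)
            record[value_name] = float(cell)
            long_rows.append(record)
    return long_rows


# Made-up sample in a wide layout (illustrative values only).
SAMPLE = """country,1995,1996,1997
Aland,1.0,,1.2
Borduria,0.5,0.6,0.7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
long_rows = wide_to_long(rows, id_fields=("country",))
for record in long_rows:
    print(record)
```

Dropping empty cells during the reshape is one possible choice; keeping them as explicit missing values would work equally well if the visualization tool handles them.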
Some important attributes are
- Scientific and technical journal articles
- Research and development expenditure (% of GDP)
- Patent applications, residents
- Researchers in R&D (per million people)
- High-technology exports (current US$)
- Royalty and license fee receipts
- School enrollment, tertiary (% gross)
In our analysis, we use Spotfire as the main tool to explore the data and Tableau as a complementary tool to extend the analysis.
Number of researchers relies on both tertiary education rate and expenditure on research
If a nation's people receive better tertiary education, do more of them go on to become researchers in R&D after graduation? And does a nation's investment in research attract more people into research? With these questions in mind, we first analyze the correlations among research and development expenditure, tertiary enrollment rate, and researchers in R&D per million people for various nations in 2006, as shown in Fig 1.
Figure 1: Number of researchers with respect to tertiary enrollment rate and research and development expenditure. The x-axis indicates tertiary enrollment rate and the y-axis indicates researchers in R&D (per million people). Spots represent nations, and larger spots mean higher research and development expenditure. Names of some representative nations are also shown.
We observe a strong positive correlation between tertiary enrollment rate and number of researchers: nations with a higher tertiary enrollment rate tend to have more researchers per million people. Conversely, where the percentage of people enrolled in universities and colleges is low, researchers are scarce, as the scatter plot shows: nations in the bottom left have both low tertiary enrollment rates and low researcher counts.
However, there are exceptions, notably Venezuela and Ukraine. Although they have very high tertiary enrollment rates, their numbers of researchers per million people are markedly low. When we inspect vertically the nations with more than 70% tertiary enrollment, a likely reason emerges: a lack of research investment is associated with few researchers despite high tertiary enrollment. A nation should therefore provide enough financial support for research and development even when its people have received good higher education; otherwise graduates may be reluctant to enter research, which slows progress in science and technology.
Digging deeper into the visualization, we find that European nations such as Finland, Sweden and Denmark have higher tertiary enrollment rates and research expenditure, and correspondingly more researchers per million people. This again supports our finding that the number of researchers relies on both tertiary enrollment rate and research expenditure. We then use Tableau to extend our findings. The top map in Fig 2 shows the relationship between average researcher number and tertiary enrollment rate from 1995 to 2010 across the world, while the bottom map illustrates the relationship between average researcher number and research expenditure.
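The correlation read off the scatter plot could also be quantified. The sketch below computes Pearson's correlation coefficient in plain Python; it is an illustrative addition rather than part of the original Spotfire analysis, and the two sample lists are made-up numbers, not values from the World Bank data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: tertiary enrollment (%) and researchers per million.
enrollment = [10.0, 25.0, 40.0, 60.0, 80.0]
researchers = [100.0, 400.0, 1500.0, 3000.0, 4500.0]

r = pearson_r(enrollment, researchers)
print(round(r, 3))
```

A coefficient close to 1 indicates a strong positive linear relationship. It does not by itself show that higher enrollment causes more researchers.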
Figure 2: World maps illustrating the impact of two factors on researcher number. Top: the relationship between average tertiary enrollment rate and average researcher number from 1995 to 2010. Larger spots mean more researchers per million people. Red means a high tertiary enrollment rate while green means a low tertiary enrollment rate. Bottom: the relationship between average research and development expenditure and average researcher number from 1995 to 2010. Larger spots mean more researchers per million people. Red means high expenditure while green means low expenditure.
Clearly, European nations tend to invest more in research and in tertiary education, and therefore have more researchers per million people. Other nations aiming to improve their competence in science and technology could learn from these European nations.
A few nations control most technology
Next, we investigate multiple attributes related to science and technology development across nations to look for notable patterns. We inspect four attributes: high-tech exports (current US$), patent applications, royalty and license fee receipts, and scientific and technical journal articles. These are important indicators of the level of science and technology development in a nation.
Fig 3 shows the treemaps of the four factors. The size of rectangles indicates the amount of each factor while the color indicates the number of researchers per million people.
Figure 3: Shares of four types of scientific and technical output held by nations from 1995 to 2010. Darker color represents more researchers per million people. From top to bottom: total number of scientific and technical journal articles, total number of patent applications, total amount of high-tech exports (current US$), and total royalty and license fee receipts.
We can immediately conclude that a small number of nations account for more than half of the world's total scientific and technical output. In particular, the United States, Japan, the United Kingdom, China, Germany and France are the most powerful nations in science and technology, and they are exactly the six nations with the largest GDP in 2010 according to the United Nations. This suggests that richer nations have more capacity to develop science and technology, whereas poor nations, such as those in Africa that are fighting disease and hunger, have little capacity to develop technology.
This is a clear-cut example of the so-called “digital divide” phenomenon, which aggravates the imbalance in science and technology between rich and poor nations and leads to scientific and technical monopoly as a result of globalization. The global community should pay attention to this, as the “digital divide” may bring instability to the world.
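The claim that a few nations hold more than half of the total can be checked numerically as well as visually. The following is a minimal sketch of such a check; the helper `top_k_share` and the totals for nations A to E are hypothetical placeholders, not figures from the data set.

```python
def top_k_share(totals, k):
    """Fraction of the grand total held by the k largest entries."""
    if k < 0:
        raise ValueError("k must be non-negative")
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("grand total must be positive")
    largest = sorted(totals.values(), reverse=True)[:k]
    return sum(largest) / grand_total


# Hypothetical per-nation totals of one indicator (e.g. journal articles).
totals = {"A": 50.0, "B": 20.0, "C": 15.0, "D": 10.0, "E": 5.0}

share = top_k_share(totals, 2)
print(f"Top 2 nations hold {share:.0%} of the total")
```

Running the same calculation for each of the four indicators would show how concentrated each one is among the leading nations.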
The dominance of US in science and technology is being challenged
Finally, we inspect the trend of science and technology development in the United States, the most powerful nation in science and technology in the world. We choose four attributes and show that the United States is likely to maintain its lead for the foreseeable future, but that its dominance is being challenged in some respects.
Fig 4, produced with Tableau, shows the trends from 1995 to 2010 for the 10 nations with the largest GDP in 2010 according to the United Nations. The thickness of each line indicates research and development expenditure, which remains stable for each nation.
Figure 4: Trends of science and technology development from 1995 to 2010 for the 10 nations with the largest GDP. Top: trends in the number of scientific and technical journal articles and in royalty and license fee receipts. Bottom: trends in high-tech exports and in the number of patent applications. Nations are represented by different colors. The thickness of each line indicates the amount of research and development expenditure. Nations listed on the right are ordered by GDP.
Clearly, the United States is far ahead of other nations in the number of scientific and technical journal articles and in royalty and license fee receipts. These two factors reflect a nation's current theoretical contributions to, and previous achievements in, science and technology. Given this large lead, we believe that the United States will continue to dominate these two fields for the next few decades.
On the other hand, China has become active in high-tech exports and patent applications since 2000, producing a rapidly rising curve, while the two attributes are stable for the United States and the number of patent applications in Japan has even dropped. One possible reason is China's national policy of stimulating science and technology development in recent decades. The United States therefore needs to find ways to boost its high-tech exports and patent applications, and to find out whether the current situation signals insufficient innovation; otherwise, its dominance in science and technology will be shaken.
Spotfire supports various kinds of visualizations, of which the scatter plot impressed us most. A scatter plot condenses information, showing up to 4 attributes in a single figure, which is very helpful when we need to analyze the correlations among multiple attributes at the same time. Spotfire also handles empty values well and loads data quickly.
Some bugs and issues are listed as follows.
- In our experience, if years are listed as separate columns, Spotfire cannot effectively use them as the horizontal axis to show a trend. A less restrictive data format would be better.
- It does not handle noisy data well in the scatter plot, especially when a single value is much larger than all the others. Spotfire tries to fit all values into the plot instead of discarding the noisy value.
- It is difficult to pick the exact attribute we want to show because the drop-down menu is small. We would prefer a more customizable drop-down menu.
- When drawing lines, the thickness of the lines cannot be tied to a measure, which is inflexible.
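One possible workaround for the single-extreme-value issue noted above is to filter the data before loading it into the plotting tool. The sketch below applies the common interquartile-range rule (Tukey's fences); this is a swapped-in preprocessing technique, not a Spotfire feature, and the sample values are made up.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Lower and upper bounds beyond which a value counts as an outlier."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical measurements with one value far larger than the rest.
values = [12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 14.0, 900.0]

kept = drop_outliers(values)
print(kept)
```

Whether an extreme value is noise or a genuine observation is a judgment call, so removed values are worth inspecting before they are discarded.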
The interface of Tableau is more attractive and friendly. We are pleased that we can freely drag and drop attributes into filters and marks, which greatly improves working efficiency. Another impressive feature is that Tableau can read nation names and place them on a world map. In our analysis, it made only one mistake in recognizing nation names, which was easy to correct by hand. Being able to use an additional attribute to control the thickness of lines is also very helpful.
Some negative experiences are listed as follows.
- We feel that Tableau loads data more slowly than Spotfire. Moreover, after loading the data, users cannot filter out irrelevant attributes before starting the exploration.
- Sometimes Tableau fails to determine the type of an attribute correctly. It persistently treats numerical values as strings.
- It is hard to find out how to convert the current visualization directly into another type of visualization rather than creating a completely new one.
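For the type-detection issue noted above, one option is to clean numeric columns before importing them, so that every cell is either a number or explicitly missing. The following is a minimal sketch under that assumption; the function `to_number` and the set of missing-value markers are hypothetical choices, not something Tableau or the data source prescribes.

```python
def to_number(cell, missing=("", "..", "NA")):
    """Parse a text cell as a float; return None if missing or unparseable."""
    text = cell.strip().replace(",", "")  # drop thousands separators
    if text in missing:
        return None
    try:
        return float(text)
    except ValueError:
        return None  # non-numeric text is treated as missing


# Hypothetical raw cells as they might appear in an exported worksheet.
raw_cells = ["1,234.5", " 42 ", "..", "abc"]
print([to_number(cell) for cell in raw_cells])
```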
We include the final Application Project Report here.