Anurag Bhatnagar's Blog

Predicting Disease Outbreaks Using Semantic Analysis

Introduction
While epidemics have been a common occurrence for much of human history, their damage was usually limited to the geographical regions where the disease originated. Now, with rapid globalization, a disease originating in a remote province in China or Mexico can spread across the globe in a matter of a few months, costing thousands of lives (Flahault 2010, 320). In an effort to contain a new disease in an early phase before it becomes a global pandemic, health agencies use advanced surveillance techniques to monitor developments in infections around the world.

The methodology employed by these systems focuses on detecting when and where a serious case might have been reported, by scraping news stories from various sources on the internet and looking for symptoms that might indicate an outbreak (Government of Canada 2004). This allows the systems to alert the responsible authorities in the first few weeks of an outbreak. While this technique is effective in its goal of detecting diseases before they become pandemics, it misses an opportunity to achieve a larger objective: predicting areas where an outbreak is imminent and containing it even earlier.

“HealthMap identified a new pattern of respiratory illness in Mexico in 2009 well before public-health officials realized a new influenza pandemic was emerging. Yet, researchers didn’t recognize the seriousness of the threat it posed.” – Dr. Brownstein, Creator of HealthMap (Weeks 2011)

A quick study of the pandemics of the last decade indicates that while global disease monitoring systems did alert the authorities within the first few weeks of an outbreak, it was already too late to contain it (Weeks 2011). Therefore, a system that can identify where the next disease outbreak is most likely to occur will be more effective in achieving these objectives. The methodology required to make such predictions is not new; it is already used in several other applications, such as predicting movie sales from sentiment analysis of social media feeds and predicting stock prices from how consumers receive a new product or service. Similarly, a semantic analysis of content from the web, combined with a knowledge base trained to look for clues of disease outbreaks based on historical patterns, can be used to predict where a disease outbreak is most likely.

Current Disease Surveillance Systems
Global and national health agencies have developed systems that scan content on the internet for threats that might develop into global pandemics. These systems rely on complex software to perform this scan because of the amount of content that needs to be monitored. As a system scans the internet, it produces results of interest to analysts, who study them further to determine whether they represent a genuine threat (Government of Canada 2004).

“It’s still a lot of information, a lot of noise. In hindsight, you can say this was the first indication, but … when you look at all these reports coming in from all over the world on a daily basis, it is hard or impossible to tell which of these reports constitutes a real threat.”

– Dr. Gunther Eysenbach, University of Toronto’s Centre for Global eHealth Innovation (Blackwell 2009)

Global Public Health Intelligence Network
The Global Public Health Intelligence Network (GPHIN) was developed by the Public Health Agency of Canada and is the world’s premier disease surveillance system. Its objective is to scour the web using a text mining engine to look for symptoms of diseases or reports of contaminated food or water supplies and, if necessary, generate an alert to an analyst. The GPHIN system uses a text mining engine developed by Nstein that is also capable of translating news articles between 7 different languages, to maximize its sources of data (Government of Canada 2004). This system detected more than half of the 578 outbreaks identified by the World Health Organization between 1998 and 2001. It was also the first disease surveillance system to detect news stories of the SARS and H1N1 flu outbreaks in their very early stages, from the Chinese and the Mexican media respectively (Blackwell 2009).

HealthMap
The HealthMap project, developed by researchers at the Children’s Hospital Boston, has a similar objective. Its software system scours the web, news websites, government and non-government organizational websites and social media feeds to collect data, which it analyzes to produce a visualization of the current state of diseases around the world (Freifeld 2008, 151). The system is capable of mining content in 9 different languages. It filters the content based on keywords of symptoms of diseases common in humans, animals and plants, then extracts the time and location information from the source. Finally, it uses this data to generate a map of the world, highlighting those regions that have the most reported cases of a specific disease (Freifeld 2008, 152).

“Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and pandemic influenza. One way to improve early detection is to monitor health-seeking behavior in the form of online web search queries, which are submitted by millions of users around the world each day.” (Ginsberg 2008, 1)

Google Flu Trends
Google’s Flu Trends is another prominent disease detection system. It has a slightly different objective from GPHIN and HealthMap: it is limited to identifying areas of high flu activity. Its methodology for generating the map of its results is also very different from that of GPHIN and HealthMap. Rather than using complex semantic analysis software, Google Flu Trends studies patterns in Google web searches to determine which areas might be experiencing a spike in flu-related cases. This information is then transmitted to local health agencies and hospitals, which can prepare for the spike in cases and in demand for flu vaccines. Google Flu Trends currently works in 15 countries and relies only on data from Google search queries (Ginsberg 2008, 1).
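To make the idea of spotting a spike in search activity concrete, here is a minimal sketch. It is not Google's actual model: it simply compares the latest week's count of flu-related queries with the preceding weeks using a z-score. The function name `flu_query_spike`, the eight-week baseline and the threshold of 2.0 are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flu_query_spike(weekly_counts, baseline_weeks=8, threshold=2.0):
    """Return True if the latest week is unusually high versus the baseline.

    weekly_counts: flu-related query counts for one region, oldest first.
    The latest week is compared with the mean of the preceding
    `baseline_weeks` weeks, measured in standard deviations (a z-score).
    All parameter defaults are illustrative assumptions.
    """
    if baseline_weeks < 2:
        raise ValueError("baseline_weeks must be at least 2")
    if len(weekly_counts) < baseline_weeks + 1:
        raise ValueError("not enough history for the chosen baseline")
    baseline = weekly_counts[-(baseline_weeks + 1):-1]
    latest = weekly_counts[-1]
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any increase counts as a spike.
        return latest > baseline[0]
    z_score = (latest - mean(baseline)) / spread
    return z_score > threshold
```

A region whose counts hover around 100 per week and then jump to 180 would be flagged, while a week at 101 would not.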

Historical Approach

“Outside experts say programs like it [GPHIN] and Harvard Medical School’s HealthMap are useful, but only when combined with other sources of information that, pieced together, paint a bigger picture.” (Blackwell 2009)

The traditional approach to the problem of containing global pandemics has been to detect them as early as possible through inter-agency co-operation at national and international levels. This meant that the agencies would share data with each other in an effort to develop an understanding of the current state of diseases around the world.

This approach was refined with the advent of the internet, as large amounts of data became available for study and analysis. Organizations made use of text mining techniques and semantic analysis to mine the internet for articles relevant to their research and analyze them further (Weeks 2011). The current methodology common to most systems can be described by the process below (Freifeld 2008, 152).

Scraping – Mine news websites, government and NGO websites for textual information containing keywords of symptoms of diseases or relevant phrases from a predetermined dictionary database.
Data extraction – Extract the useful information from the text, such as the title, author, source, date, body and the URL.
Semantic Analysis – Extract information from the content about its location, time, and disease related information, such as symptoms and description.
Generate Result – The data produced is transferred to analysts that examine it further to assess the risk posed.
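The four steps above can be sketched as a small pipeline. This is only an illustration, not HealthMap's or GPHIN's actual code: the keyword and location sets are tiny stand-ins for a real dictionary database, and the names `Article`, `matches_keywords`, `extract_fields`, `analyze` and `run_pipeline` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical dictionaries; a real system would use far larger, curated ones.
SYMPTOM_KEYWORDS = {"fever", "cough", "respiratory", "outbreak"}
KNOWN_LOCATIONS = {"Guangdong", "Mexico", "Haiti"}


@dataclass
class Article:
    title: str
    source: str
    date: str
    body: str
    url: str


def _words(text):
    """Split text into words with surrounding punctuation removed."""
    return [w.strip(".,;:!?\"'") for w in text.split()]


def matches_keywords(text):
    """Step 1 (scraping filter): keep text that mentions any dictionary keyword."""
    return bool({w.lower() for w in _words(text)} & SYMPTOM_KEYWORDS)


def extract_fields(raw):
    """Step 2 (data extraction): pull the useful fields out of a raw record."""
    return Article(
        title=raw.get("title", ""),
        source=raw.get("source", ""),
        date=raw.get("date", ""),
        body=raw.get("body", ""),
        url=raw.get("url", ""),
    )


def analyze(article):
    """Step 3 (semantic analysis), reduced here to simple dictionary lookups."""
    words = _words(article.body)
    locations = sorted({w for w in words if w in KNOWN_LOCATIONS})
    symptoms = sorted({w.lower() for w in words if w.lower() in SYMPTOM_KEYWORDS})
    return {"url": article.url, "date": article.date,
            "locations": locations, "symptoms": symptoms}


def run_pipeline(raw_records):
    """Step 4 (generate result): collect findings for analysts to review."""
    findings = []
    for raw in raw_records:
        if not matches_keywords(raw.get("body", "")):
            continue
        findings.append(analyze(extract_fields(raw)))
    return findings
```

The sketch also shows the limitation discussed later in this post: a report that never uses one of the dictionary keywords is dropped at the first step, however relevant it may be.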

Figure 1. HealthMap System Architecture

Figure 1 illustrates the architecture used by HealthMap to generate the map of areas suffering from disease outbreaks. Most systems that use this methodology are able to analyze sources in multiple languages by using language translation tools and dictionaries in the second step (Freifeld 2008, 153). Such an approach proves to be quite effective in filtering the content and producing relevant results. For example, using this methodology, GPHIN was able to detect a news story out of the Guangdong province of China, within weeks of the first SARS case, about an outbreak of a “respiratory” disease (Blackwell 2009). It was also able to detect a news story out of Mexico in 2009 that reported deaths caused by a “strange epidemic outbreak” of a flu-like illness, weeks before it was known to the world as H1N1 flu (Ibid).

Pitfalls of the Historical Approach
The objective of developing these advanced software systems is to contain the spread of a new disease in as early a phase as possible, before it becomes a global pandemic (Government of Canada 2004). However, the historical approach has focused on detecting the millions of cases of disease around the world, and then analyzing them by time, location, severity and symptoms to determine whether they are a global threat. This dilutes the original objective of preventing disease outbreaks and re-channels it towards detecting outbreaks and then alerting the authorities. Although this approach can detect a disease outbreak weeks before it becomes a pandemic, it does not address the underlying problem, because it does not go far enough in taking advantage of the technologies and data available to us.

The scope of the data mined by historical approaches is limited to a few languages and sources such as large public news outlets. Due to this, the system inherits a bias in its data towards certain regions of the world.

Data Sources
One of the biggest shortcomings of the historical approach is its limited scope of data sources. A system can inherit a bias in its content from the data sources it monitors and the languages it supports. If the data sources are limited to RSS feeds of prominent news outlets, social media feeds and search queries, as in most current systems, the system ignores a large body of content that could contain very significant information. Many of the reports that can help determine areas of concern are published by non-governmental agencies, aid agencies and other international monitoring agencies, or appear in conference talks and even blogs. These sources might not include reports of actual cases of a disease outbreak, but a study of their content can reveal signs that an outbreak might be imminent. For example, Doctors Without Borders might publish a report detailing deteriorating water standards in a specific region of the world. Current text mining engines might overlook it because it does not contain certain keywords or phrases, yet the information gathered from it might lead an intelligent system to flag the region as a possible site of a disease outbreak.

The text mining engines used by current surveillance systems are limited in their functionality to getting the time and location information of an event, rather than learning new facts from the context of the article.

Semantic Analysis
The historical approaches use advances in natural language processing in a very limited way. While they use a data mining engine to learn time and location information from a data source in which it found matches for certain keywords or phrases, they fail to develop a knowledge base of facts that could then be used to perform a more effective analysis of a situation. By performing only a keyword analysis, they also fail to perform a thorough semantic analysis of the original content to scan for information that could be relevant to a disease outbreak (Freifeld 2008, 157).

Threat Analysis
The methodology for threat analysis differs from system to system; however, none of them is advanced enough to analyze the data by comparing it to a knowledge base of facts and determine how severe or unusual a threat might be (Weeks 2011). For example, while GPHIN was able to detect the story about the outbreak of SARS in 2002 from a Chinese media source, it failed to perform an automated threat analysis of the story, and was not successful in alerting the global health agencies in time to contain the outbreak within the province of Guangdong (Ibid).

Predict, not detect

“The next noteworthy outbreak may as easily come from a major urban center in North America as a rural village in Africa.” (Freifeld 2008, 152)

A smarter and more effective approach to preventing a global pandemic is to predict which region of the world is the most likely breeding ground for one. In such a scenario, prevention could precede the first reported cases of the disease. While certain parts of the world will always be more prone to an outbreak because of a lack of public health infrastructure or persistently poor sanitary conditions, a new strain of a disease that has the potential to spread globally as a pandemic can come from any part of the world (Freifeld 2008, 152).

Determining Factors

Parts of the world that are at high risk of an epidemic could be identified by factors such as animal farming practices in a local community (Cyranoski 2005), food growing practices, sanitary conditions, the quality of water and food sources (Piarroux 2011, 1162), civil unrest and even humanitarian crises. A system that monitors the internet for these factors and builds a knowledge base of facts with which to analyze the reports will be far more effective in reaching these objectives than a system that relies on detecting an outbreak from news stories after it has occurred. This has been true for SARS and avian flu, where the conditions for the emergence of such a disease could have been predicted had a system been in place to monitor for such factors.

“The concentration of humans or animals in proximity enhances potential transmission of microorganisms among members of the group. It also creates greater potential for infecting surrounding life forms, even those of different species. The conditions created also may be a breeding ground for new, more infectious, or more resistant microorganisms. ” (Gilchrist 2006, 313)

Methodology for a Predicting System

Developing a knowledge base of facts and a Bayesian network to determine what factors could indicate an outbreak can vastly improve the effectiveness of disease surveillance systems.

The goal of predicting disease outbreaks rather than detecting them is not beyond the reach of current technological capabilities. Software systems that are able to filter the data for facts and use it to perform a sophisticated analysis already exist, and are commonly used in other applications such as predicting civil uprisings (Leetaru 2011), consumer behavior (Harrison 2012) and even movie ticket sales (Yu 2012, 720).

Knowledge Base
Applying a similar approach to disease prevention requires a modification to the current methodology and software architecture. The new architecture will need a separate database of facts, called a knowledge base, that can be fed into the system manually or “learned” from a data source. This component of the design will be responsible for making informed decisions based on the data it gathers and the facts it already knows. It will be accessible to the team of medical researchers and technical staff that maintain the system, so that they can update the knowledge base accordingly.

The knowledge base will consist of facts that it needs to make decisions, such as:

Factors that can trigger a disease outbreak
Patterns of historical disease outbreaks
Bayesian network of symptoms, their classification and relations to diseases
Bayesian network of diseases and their relation to age, gender and ethnicity
Economic data of a country – Gini index, accessibility of remote areas of the country
Data on public health institutions – number of hospitals, number of doctors per capita
Data on political stability

The knowledge base is the most significant modification to the existing methodology. It will evolve as it learns more data, improving its prediction algorithm over time. It will act as the brain of the new system, processing information in order to perform a threat analysis of a disease, a symptom or any event. For example, when the system learns of a large-magnitude earthquake in Haiti, it will be able to process the geographic, economic and demographic information through its knowledge base and predict that diseases common where sanitary conditions deteriorate, such as cholera, are likely, as was the case after the 2010 earthquake (Piarroux 2011, 1162).
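As a rough sketch of how facts in the knowledge base could shift the system's estimate, the snippet below applies Bayes' rule to a prior outbreak probability, one observed factor at a time. It treats the factors as conditionally independent, which is a naive Bayes simplification rather than a full Bayesian network. Every probability in it is a made-up number for illustration, and the function name `posterior_outbreak_probability` is hypothetical.

```python
def posterior_outbreak_probability(prior, evidence):
    """Update an outbreak probability with Bayes' rule.

    prior: P(outbreak) before looking at any evidence.
    evidence: list of (p_given_outbreak, p_given_no_outbreak) pairs, one
        per observed factor, treated as conditionally independent
        (a naive Bayes simplification of a full Bayesian network).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for p_if_outbreak, p_if_none in evidence:
        if p_if_none <= 0.0:
            raise ValueError("p_given_no_outbreak must be positive")
        odds *= p_if_outbreak / p_if_none  # multiply by the likelihood ratio
    return odds / (1.0 + odds)


# Hypothetical numbers, chosen only to show the mechanics.
observed = [
    (0.8, 0.1),  # damaged water and sanitation infrastructure reported
    (0.6, 0.3),  # few doctors per capita in the affected region
]
risk = posterior_outbreak_probability(prior=0.01, evidence=observed)
```

With these invented inputs, a 1% prior rises to roughly 14%, which shows how two individually weak signals can combine into something worth an analyst's attention.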

Data Source
One of the most important shortcomings of the current methodologies that needs to be addressed is the small scope of their data sources. The effectiveness of the new system depends directly on the breadth of data fed into it, because as it scours the data, it looks for facts and stores them in its knowledge base. Therefore, the more the system scours, the more it learns and the better its predictions become. While expanding the sources and formats the system accepts to include blogs, non-governmental websites, hospital records, conference discussions, journals and social media feeds is not a challenge, it is quite important to assure the quality of the content reported by these sources. Therefore, the scanning engine will determine the source of each article, look up its references, and over time develop a database of good-quality sources that it will use to filter its data gathering.
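One possible way to build up such a database of good-quality sources is to keep a running score per source, based on how often its reports were later confirmed. The class below is a sketch under that assumption: the name `SourceQuality`, the smoothing rule and the 0.6 cut-off are all illustrative choices, not part of any existing system.

```python
class SourceQuality:
    """Track how often each source's reports were later confirmed."""

    def __init__(self):
        self._confirmed = {}
        self._total = {}

    def record(self, source, confirmed):
        """Store one analyst verdict for a report from `source`."""
        self._total[source] = self._total.get(source, 0) + 1
        if confirmed:
            self._confirmed[source] = self._confirmed.get(source, 0) + 1

    def score(self, source):
        """Smoothed fraction of confirmed reports; unseen sources get 0.5."""
        confirmed = self._confirmed.get(source, 0)
        total = self._total.get(source, 0)
        return (confirmed + 1) / (total + 2)

    def trusted(self, min_score=0.6):
        """Names of sources whose score is at least `min_score`, sorted."""
        return sorted(s for s in self._total if self.score(s) >= min_score)
```

The add-one smoothing keeps a source with a single confirmed report from immediately outranking one with a long, mostly reliable record.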

Semantic Analysis
The semantic analysis performed under this methodology looks for more than just keywords of symptoms or phrases common in disease descriptions. It also scans for factors that indicate a likely disease outbreak, such as news stories of poor animal or food farming practices or suspicious deaths, and for news stories of a humanitarian crisis in a region of the world that is not equipped to deal with it. It scans the content for articles that might indicate factors that have been flagged in the knowledge base. The semantic analysis also conducts a temporal and contextual analysis of each article to learn the time, location and context in which the event occurred, using facts from the knowledge base to improve the analysis. Most importantly, it is at this step that the semantic analysis engine uses a pattern dictionary approach to handle content in as many languages as its language module supports. Using a dictionary translator, it looks for the same keywords and phrases in sources written in other languages. Over time, it learns from this pattern dictionary approach and improves the language module of the semantic analysis engine as well.
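A pattern dictionary of this kind can be pictured as a table that maps surface forms in several languages onto one shared concept. The sketch below is a deliberately small illustration: the entries, the concept labels and the function names `find_concepts` and `learn_pattern` are assumptions, and plain substring matching stands in for real linguistic analysis.

```python
# Hypothetical pattern dictionary: surface form in some language -> concept.
PATTERN_DICTIONARY = {
    "fever": "fever",
    "fiebre": "fever",                   # Spanish
    "fièvre": "fever",                   # French
    "cholera": "cholera",
    "cólera": "cholera",                 # Spanish
    "contaminated water": "unsafe_water",
    "agua contaminada": "unsafe_water",  # Spanish
}


def find_concepts(text):
    """Return the sorted concepts whose patterns occur in `text`."""
    lowered = text.lower()
    return sorted({concept for pattern, concept in PATTERN_DICTIONARY.items()
                   if pattern in lowered})


def learn_pattern(pattern, concept):
    """Add a newly learned surface form so later scans recognise it."""
    PATTERN_DICTIONARY[pattern.lower()] = concept
```

A Spanish sentence mentioning "cólera" and "agua contaminada" maps to the same concepts as its English equivalent, and `learn_pattern` models the language module improving over time. Substring matching will produce false hits on longer words, so a real engine would need tokenization at the very least.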

Threat Analysis
This is one of the most significant steps of the methodology for a system that is able to make predictions. It is at this step that the system analyzes each item of data it has collected and uses its contextual, language, location and time information to correlate it with the related information in the knowledge base and produce a severity rating. The severity rating of each item is based on an analysis of its content, such as the event that has happened, combined with the facts in the knowledge base that allow the system to make intelligent predictions. The system uses the data gathered from the semantic analysis to check whether the facts in the knowledge base, such as the Bayesian network and other factors, together indicate that the event poses a serious threat. If it does, the system assigns it a rating based on the quality of the source, the historical pattern and the socio-economic indicators of the country.
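The final rating could be computed in many ways. As one hedged example, the function below takes the three inputs named above, each expressed as a number between 0 and 1, and folds them into a rating from 1 to 5 with a weighted average. The weights, the five-point scale and the name `severity_rating` are assumptions made for this sketch only.

```python
def severity_rating(source_quality, historical_match, vulnerability,
                    weights=(0.3, 0.4, 0.3)):
    """Combine three indicators, each in [0, 1], into a 1-5 severity rating.

    source_quality: how reliable the reporting source has been.
    historical_match: similarity of the event to past outbreak patterns.
    vulnerability: socio-economic exposure of the affected country.
    The default weights are illustrative, not calibrated values.
    """
    indicators = (source_quality, historical_match, vulnerability)
    if any(not 0.0 <= value <= 1.0 for value in indicators):
        raise ValueError("indicators must lie between 0 and 1")
    total_weight = sum(weights)
    score = sum(w * v for w, v in zip(weights, indicators)) / total_weight
    # Map the weighted score in [0, 1] onto whole ratings 1 (low) to 5 (high).
    return 1 + min(4, int(score * 5))
```

In practice the weights would have to be tuned against historical outbreaks, which is exactly the kind of knowledge the knowledge base is meant to accumulate.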

Generate Results
An analysis of each event and the threat it poses requires the power of a large knowledge base working behind the scenes to be able to make predictions.

The final step of the methodology is to produce a visualization of the data so that analysts and medical authorities can make use of it. HealthMap has a good front-end visualization of its results, which is shown in the figure below.

Figure 2. Visualization of HealthMap’s Data

Conclusion
Considering the goals of a global disease surveillance system and the systems examined in this paper, several important conclusions can be reached. The current systems do not employ a methodology that can provide alerts with adequate time to contain the global spread of a disease (Weeks 2011). Their methodology involves monitoring a limited set of sources on the internet for keywords or phrases that could indicate a risk of a potential disease outbreak. However, a thorough semantic analysis of all content that looks for factors indicating where a disease outbreak might occur could make the system far more effective. This analysis, combined with a knowledge base of facts that helps the system process the data it gathers and rate it by the threat it poses, is an indispensable component of any modern disease surveillance system. An advanced monitoring system that can use the knowledge base to make predictions can alert health authorities to a potential disease outbreak weeks before it occurs, and thus save thousands of lives and billions of dollars.

Works Cited
Abbasi, Ahmed, Hsinchun Chen, and Arab Salem. 2008. ‘Sentiment Analysis in Multiple Languages: Feature Selection for Opinion Classification in Web Forums’. ACM Transactions on Information Systems. http://lcweb.senecac.on.ca:2134/10.1145/1370000/1361685/a12-abbasi.pdf?ip=142.204.1.85&acc=ACTIVE%20SERVICE&CFID=95479844&CFTOKEN=34864361&__acm__=1333510662_0da55de99a1cb0f3aaa2f528d6e2dd7f.

Blackwell, Tom. 2009. ‘Health Officials Aim to Hone Disease Surveillance’, May 3. http://www.nationalpost.com/news/canada/story.html?id=1562873.

Cyranoski, David. 2005. ‘Bird Flu Spreads Among Java’s Pigs’. Nature 435 (7041) (May 25): 390–391. doi:10.1038/435390a.

Flahault, Antoine and Patrick Zylberman. 2010. ‘Influenza Pandemics: Past, Present and Future Challenges’. BMC Infectious Diseases 10 (1): 162. doi:10.1186/1471-2334-10-162.

Freifeld, Clark C, Kenneth D Mandl, Ben Y Reis, and John S Brownstein. 2008. ‘HealthMap: Global Infectious Disease Monitoring Through Automated Classification and Visualization of Internet Media Reports’. Journal of the American Medical Informatics Association 15 (2) (January 3): 150–157. doi:10.1197/jamia.M2544.

Gilchrist, Mary J., Christina Greko, David B. Wallinga, George W. Beran, David G. Riley, and Peter S. Thorne. 2006. ‘The Potential Role of Concentrated Animal Feeding Operations in Infectious Disease Epidemics and Antibiotic Resistance’. Environmental Health Perspectives 115 (2) (November 14): 313–316. doi:10.1289/ehp.8837.

Ginsberg, Jeremy, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, and Larry Brilliant. 2008. ‘Detecting Influenza Epidemics Using Search Engine Query Data’. Nature 457 (7232) (November 19): 1012–1014. doi:10.1038/nature07634.

Government of Canada, Public Health Agency of Canada. 2004. ‘Global Public Health Intelligence Network (GPHIN) – Information – Public Health Agency of Canada’. http://www.phac-aspc.gc.ca/media/nr-rp/2004/2004_gphin-rmispbk-eng.php.

Harrison, Guy. 2012. ‘Sentiment Analysis Could Revolutionize Market Research’. Database Trends & Applications.

HealthMap System Architecture. Image. 2007. http://jamia.bmj.com/content/15/2/150.full#xref-ref-27-1 (accessed March 29, 2012).

Heffernan, Richard, Farzad Mostashari, Debjani Das, Adam Karpati, Martin Kulldorff, and Don Weiss. ‘Syndromic Surveillance in Public Health Practice, New York City’. http://wwwnc.cdc.gov/eid/article/10/5/03-0646_article.htm.

Leetaru, Kalev H. 2011. ‘Culturomics 2.0: Forecasting Large–scale Human Behavior Using Global News Media Tone in Time and Space’. First Monday 16 (9) (August 17). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3663/3040.

Neill, Daniel B. 2012. ‘New Directions in Artificial Intelligence for Public Health Surveillance’. IEEE Intelligent Systems 27 (1) (January): 56–59. doi:10.1109/MIS.2012.18.

Piarroux, Renaud. 2011. ‘Understanding the Cholera Epidemic, Haiti’. Emerging Infectious Diseases 17 (7) (July): 1161–1168. doi:10.3201/eid1707.110059.

Recorded Future. 2011. ‘Big Data for the Future: Unlocking the Predictive Power of the Web’. Education October 18. http://www.slideshare.net/RecordedFuture/big-data-for-the-future-unlocking-the-predictive-power-of-the-web.

Subrahmanian, V.S., and Diego Reforgiato. 2008. ‘AVA: Adjective-Verb-Adverb Combinations for Sentiment Analysis’. IEEE Intelligent Systems 23 (4) (July): 43–50. doi:10.1109/MIS.2008.57.

Visualization of HealthMap’s Data. Image. 2012. http://www.healthmap.org (accessed April 03, 2012).

Weeks, Carly. 2011. ‘Social Media Could Help Detect Pandemics, MD Says’. The Globe and Mail. http://www.theglobeandmail.com/life/health/new-health/health-news/social-media-could-help-detect-pandemics-md-says/article2077719/.

Yu, Xiaohui, Yang Liu, Xiangji Huang, and Aijun An. 2012. ‘Mining Online Reviews for Predicting Sales Performance: A Case Study in the Movie Domain’. IEEE Transactions on Knowledge & Data Engineering.

Zheng, Wanhong, Evangelos Milios, and Carolyn Watters. 2002. ‘Filtering for Medical News Items Using a Machine Learning Approach’. Proceedings of the AMIA Symposium: 949–953.

Posted by anurag

Category: Software

disease outbreak, disease surveillance, influenza, natural language processing, sars, semantic analysis, sentiment analysis

April 16th, 2012 at 9:31 PM


Plato’s Allegory of the Cave and the condition of Mankind

There is only one good, knowledge, and one evil, ignorance – Socrates [Yonge XIV]

In the Allegory of the cave, Plato describes the human condition that existed in society at the time of the persecution of his mentor Socrates for corrupting the minds of the youth [Durant 12]. However, what Plato was able to identify is a fundamental characteristic of mankind that exists even today in our society. Plato’s depiction of the cave is composed of two strata of human society that are relevant even today – the prisoners (the common man) and the puppet handlers (establishment). The prisoners are ignorant of the truth, because they accept what they see as reality and do not think beyond what they see. They see shadows dance on the cave wall in front of them their whole lives and think it is the only reality possible because they have been “in it from childhood with their legs and necks in bonds so that they are fixed [on the wall]” [Plato VII 514b]. The shadows are distortions of truth that are presented to them by the rulers that use them as a “throng of lies and deceptions for the benefit of the ruled” [Plato V 459d]. Similarly, the common man today is ignorant of the truth, because he has been forced into narrow-mindedness since childhood by the establishment (puppet handlers). In the course of human history, very few men have broken these shackles of narrow-mindedness and escaped from the cave by the use of reason to be enlightened by the truth and see the reality that we live in.

It wasn’t until two millennia after Socrates was persecuted for corrupting the minds of the youth with his reasoning that Nicolaus Copernicus used logic and reason to question the foundations of Ptolemy’s model of the universe. For over 1500 years since Ptolemy, it was accepted that the Earth was stationary and the center of the universe, because it conformed to the established view of the church – that the Earth and our place in the universe was special [Ridpath 62]. However, Copernicus didn’t accept the perception of his senses as reality and thought about the shadows on the wall, just as the escaped prisoner in the Allegory of the cave did. Breaking away from conventional thinking, he realized that what Ptolemy perceived to be the motion of the Sun revolving around the Earth wasn’t a real motion, but only apparent motion due to the rotation of the Earth. He realized that the Sun, not the Earth, was the center of the universe and that the Earth went around the Sun just as the other planets did [Ridpath 62]. Out of fear of persecution by the church, Copernicus delayed publishing his thesis for decades, finally publishing it the same year he died. In the preface he confessed to Pope Paul III:

I reckon easily enough, Most Holy Father, that as soon as certain people learn that in these books of mine in which I have written about the revolutions of the spheres of the world I attribute certain motions to the terrestrial globe, they will immediately shout to have me and my opinion hooted off the stage … when I weighed these things in my mind, the scorn which I had to fear on account of the newness and absurdity of my opinion almost drove me to abandon a work already undertaken (Hawking 8).

However, he was ultimately compelled to publish his thesis, just as the escaped prisoner returns to the cave to tell the other prisoners the truth, in an attempt to make them realize they are in a cave. It didn’t receive much attention until seven decades later, when Galileo Galilei, with the advent of the telescope, discovered that Jupiter had moons revolving around it, thus contradicting the church’s view that all heavenly bodies revolved around the Earth [Ridpath 80]. Galileo was ostracized for supporting Copernicus’s view of a heliocentric solar system and persecuted by the church when he tried to open the eyes of the people and show them the distinction between the illusions of the merely empirical and the realities of the heavens [Sagan 115].

Only a quarter of a millennium later the church got another rude awakening when Charles Darwin published his book ‘The Origin of Species’, in which he questioned the premise of the church’s assertion of creationism. While aboard the H.M.S. Beagle in 1835, Darwin was surveying the wildlife in the Galapagos Islands when he saw that the same species existed on several isolated islands but with slight distinctions [Darwin 528-40]. Rather than taking what he saw at face value, he tried to look beyond what his senses perceived to be real and set forth on a journey to discover the truth. Just as the prisoner going from the cave into the light needs some time for his eyes to adjust to the brightness as they are illuminated by the truth, it took Darwin a few years before he was convinced of his ideas of natural selection. He reveals his scepticism about the theory of natural selection in Chapter VI of The Origin of Species, in the ‘Difficulties with the theory’:

To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, absurd in the highest possible degree. (Darwin 227).

Darwin saw the truth because he realized that what he was looking at were reflections and shadows of the truth, and he tried to unshackle himself from his own ignorance and rise above it to see the truth. In 1859, he published his findings in ‘The Origin of Species’, which won him wide criticism and ridicule from the people, who, like the other prisoners still in the cave, saw the distorted shadows of truth, believed them to be real, and scoffed at Darwin for suggesting otherwise.

In many ways, the cave is our society in microcosm, and while many people have become enlightened over the centuries, Copernicus, Galileo and Darwin are the epitome of Plato’s prisoner who escapes from the cave. While most of mankind is stuck in the cave, confined by its own ignorance and self-contentment and believing everything in front of it to be reality, only a few individuals unbind themselves to escape the cave and explore the world we live in. They do not seek their knowledge from their senses or their experience, but by looking beyond their senses and using reason to find the underlying causes of things. However, they are often ridiculed for their perceptions of reality when they convey them to the people, who are too familiar with their own version of reality.

Works Cited
1) Darwin, Charles. The Origin of Species.
New York: Random House, 1998.
2) Durant, Will. The Story of Philosophy.
New York: Simon and Schuster, 2006.
3) Hawking, Stephen. On the Shoulders of Giants.
Philadelphia: Running Press, 2002.
4) Plato. Translated by C.D.C Reeve. The Republic.
Indianapolis: Hackett Publishing, 2004
5) Ridpath, Ian. Book of the Universe.
London: Parkgate Books, 1991.
6) Sagan, Carl. The Cosmos.
New York: Random House, 1985.
7) Yonge, C.D. The Lives and Opinion of Eminent Philosophers. Nov. 25, 2008.
http://fxylib.znufe.edu.cn/wgfljd/%E5%8F%A4%E5%85%B8%E4%BF%AE%E8%BE%9E%E5%AD%A6/pw/diogenes/dlsocrates.htm#cite

Posted by anurag

Category: Philosophy

allegory of the cave, beagle, church, copernicus, darwin, establishment, galileo, hawking, jupiter, origin of species, plato, pope, retogade motion, saturn, socrates

March 16th, 2012 at 8:26 PM

1 response

Divide and Rule in British Raj

Human history has witnessed the rise and fall of several empires over the course of the last four millennia. These empires grew for the benefit of a small population, at the expense of another large, often foreign, population. While the way in which nations imperialize others has transformed over the centuries, the techniques by which they dominate and maintain control over others have only improved. One of the finest examples of this in recent history is the British Raj in India. The partition of the Indian sub-continent stemmed not from the demographic makeup of the region, but from the divide and rule policy employed by the British Crown.

For centuries, the Indian subcontinent was home to people from several different ethnicities and religious groups, all the while tolerant of each other and of minorities. This is evident in the fact that India is today home to a very diverse population. From Zoroastrians seeking refuge from Iran, to Buddhists, Sikhs, Muslims and Christians that assimilated into the Hindu majority, the sub-continent provided everyone with a sanctuary from religious persecution (Hasan 25). Although the differences between the religions and customs of these communities were stark, they never came to the brink of partition of a state until rule under the British crown (Hasan 26).

Ironically, the first war for independence of unified India, in 1857, started a chain of events that culminated in the partition of the nation. For over 100 years the East India Company had monopolized the economy of the sub-continent, and to ensure its interests were met, it maintained an army comprising Indians. This standing army was used to wage war against those who did not open their markets to Britain, and to intimidate others (Guha 86). In 1857, dubious claims that the cartridges used in this army were made of animal fat, offensive to both Hindus and Muslims, triggered a mutiny among the lower ranks of the army. The different communities living in India were already agitated by the attempts made by the East India Company to reform social norms, and given this mutiny, they took the opportunity to launch a massive revolt against British rule (Habib 8). Such a revolt against a foreign rule was new in India and demonstrated the feeling of unity among the people. However, as so often in history, the consequences of this revolt were the opposite of those intended. Rather than reducing British control over India, it formally thrust the country into the hands of the British crown.

At this time, Britain was engaged in several trade wars to maintain its dominance over the global markets, and the control of India as a colony and a market for its goods became central to its geo-political strategy. As an established global power, Britain required a large, well-funded naval fleet and armed forces to ensure its interests around the world were represented. Britain had already gone to war with China and brought to an end the Chinese-controlled Canton system of trade in 1842 (Fairbank 355). However, with the rise of other economic powers, Britain felt a greater need to assert itself and increase its area of hegemony. In 1854, Commodore Matthew Perry forced the Kanagawa treaty on Japan, opening up two of Japan’s ports to the US (Lubar 25). The growing imperialist ambitions of other powers meant that it was imperative for Britain to maintain a colony that could sustain its large scale wars to protect its trade monopoly. Its first response to the Kanagawa treaty was to re-engage with China in 1856 in the second Opium war (Haq 117). Then, in 1857, it took over the control of India from the East India Company, under the pretence of including India in its Empire, to develop it as a market for its own products.

As part of Britain’s grand scheme to make India a permanent colony, it had made huge investments to develop India’s infrastructure, and it was imperative that it be able to squelch any opposition that threatened its supremacy. To exploit India’s resources and to maintain an easy flow of goods from India’s hinterland to its ports, Britain had funded large scale projects such as railways, roads, canals and bridges. It also established telegraph links to administer its colony efficiently. This large scale infrastructure enabled Britain to transport goods from India to Britain, and also to bring goods from Britain for the retail markets in India (Stein). While this burgeoning economy was great for Britain, it had a severe impact on India and its poor, who were dependent on the local industry for their livelihood. However, the value to Britain of the infrastructure and the trade with India was so immense that Britain realized it needed to maintain India as a permanent colony, and it began taking steps to consolidate it.

In order to maintain control over India, Britain’s weapon of choice was the divide and rule policy, and its first significant use was to conduct the census of 1872. As a global power, Britain understood the importance of maintaining civil order and knew that one of the most historically successful techniques to weaken any opposition was divide and rule. The objective is to foster an environment of mistrust among the local population, to distract them from the real enemy and ensure they will not be united. As the first step towards this goal, Britain commissioned a census of the entire population in India, the first of its kind in modern history, to learn about the social composition of its different regions. The census took over six years to compile, but the results, which detailed the demographics of India based on religion, caste and occupation, provided Britain with a recipe for creating communal disharmony (Census of Bengal).

By the end of the century, British rule was facing growing opposition among the elite classes in India, and it responded by employing the divide and rule policy in Bengal to exacerbate the tensions between the local populations. The socio-economic policies of Britain had been attracting the ire of those Indians who had studied in Britain and returned to India. They were starting to realize the hypocrisy of British imperialism in India and formed political organizations to defend their political rights. One such organization was the Indian National Congress, formed in 1885, which remained at the forefront of the Indian independence movement. In Bengal, major agitations were organized by groups to demand greater participation of Indians in their own governance (Bose). This resulted in Britain’s second significant step to employ divide and rule in India – the partition of Bengal in 1905. This was significant for Britain’s political objectives because Bengal was partitioned along religious lines – wealthy Hindu land owners in the West lost their lands to Muslims, to whom the land had been leased in the East. This generated feelings of animosity and mistrust between the two religious groups and resulted in large scale riots against the partition. However, Britain’s objectives for the time being had been met, as the unified movement demanding more political rights in Bengal lost its momentum.

With the seeds of communal disharmony sown, the policy began showing immediate results for Britain as other political organizations also split along religious lines. One of the direct results of Britain’s move to partition Bengal and undercut its opposition was the rise of a new political party in Indian politics – one exclusively for Muslims. The All India Muslim League was formed in 1906 by elite Muslims of India with an agenda to unite for the defense of Muslim rights across the nation. The League adopted the ideas written in The Green Book by Mohammed Ali Jinnah, one of the founding members, which set out how to defend the rights and liberties of Muslims (Jalal). This in turn provoked the Hindus to form their own political party, the Hindu Mahasabha, in 1915. The Mahasabha was formed with the objective of protecting the rights and liberties of Hindus across India. Both became major political parties by 1920 and were openly critical of the secular Congress party. This was a great victory for the divide and rule policy of Britain: it had already been ruling and consolidating its control over India for 60 years, and still it did not face any serious unified opposition to its rule.

The divide and rule policy was starting to bear fruit for Britain and poison India’s communal harmony, as even through two World Wars, Britain was able to sustain its rule over India. The rise of the communal political parties had already engulfed Indian politics with rhetoric of hate and anger. This infighting between the parties for their own objectives allowed Britain to exploit India even further for another three decades. Even as negotiations for an independent India began in 1931 at the Round Table Conferences in London, the Indian population remained split on what should be the outlook of independent India. The Lahore resolution adopted by the Muslim League in 1940 cemented the idea of a two-nation theory. The resolution called for greater Muslim autonomy under British rule, which ultimately led to the creation of West Pakistan and East Pakistan.

The face of the Indian sub-continent and the state of its divisive local politics today owe much to the divide and rule policy of Great Britain. The policy was a huge success for Britain as it gave the crown the power to rule and exploit India as a market for 90 years, at the cost of the political rights of the local population. Even through the 1850s, when Britain faced stiff competition from other imperial powers like the United States, and again in the 1890s with the rise of Japan after the Meiji restoration, Britain was able to sustain its economy. The divide and rule policy, which was just an imperial tool for Britain, sealed the destiny of the Indian sub-continent for centuries to come. Even as the divide and rule policy was just beginning to take effect in the 1870s, it set in motion a chain of events that would ultimately culminate in the partition of the sub-continent.

Works Cited

1) Bose, Sugata, and Ayesha Jalal. Modern South Asia: History, Culture, Political Economy. New York: Routledge, 2009. Print.

2) Census of Bengal, 1881

Journal of the Statistical Society of London , Vol. 46, No. 4 (Dec., 1883), pp. 680-690

Published by: Blackwell Publishing for the Royal Statistical Society

Article Stable URL: http://www.jstor.org/stable/2979312

3) The Crucial Years of Early Anglo-Chinese Relations, 1750-1800 by Earl H. Pritchard

Review by: T. K. Fairbank

Journal of the American Oriental Society , Vol. 57, No. 3 (Sep., 1937), pp. 353-357

Published by: American Oriental Society

Article Stable URL: http://www.jstor.org/stable/594601

4) The Coming of 1857

Irfan Habib

Social Scientist , Vol. 26, No. 1/4 (Jan. – Apr., 1998), pp. 6-15

Published by: Social Scientist

Article Stable URL: http://www.jstor.org/stable/3517577

5) Modern China and Opium: A Reader by Alan Baumler

Review by: M. Emdad-ul Haq

Pacific Affairs , Vol. 76, No. 1 (Spring, 2003), pp. 116-118

Published by: Pacific Affairs, University of British Columbia

Article Stable URL: http://www.jstor.org/stable/40024006

6) Communalism and Communal Violence in India

Zoya Khaliq Hasan

Social Scientist , Vol. 10, No. 2 (Feb., 1982), pp. 25-39

Published by: Social Scientist

Article Stable URL: http://www.jstor.org/stable/3516974

7) A Conquest Foretold

Ranajit Guha

Social Text , No. 54 (Spring, 1998), pp. 85-99

Published by: Duke University Press

Article Stable URL: http://www.jstor.org/stable/466751

8) Jalal, Ayesha. The Sole Spokesman: Jinnah, the Muslim League, and the Demand for Pakistan. Cambridge: Cambridge UP, 1994. Print.

9) In the Footsteps of Perry: The Smithsonian Goes to Japan

Steven Lubar

The Public Historian , Vol. 17, No. 3 (Summer, 1995), pp. 25-59

Published by: University of California Press on behalf of the National Council on Public History

Article Stable URL: http://www.jstor.org/stable/3378751

10) Stein, Burton. A History of India. New Delhi: Oxford University, 2001. Print

Posted by anurag

Category: History

1857 revolt, 1947, british crown, british raj, canton system, china, Commodore perry, divide and rule, hindu mahasabha, india, indian national congress, jinnah, opium war, partition of india, subhash chandra bose

February 20th, 2012 at 8:40 PM

2 responses

Processing.js Bug 1606 – Review needs work update

Bug: Link here
Commit: Link here

At the last checkpoint for this bug, my code had been reviewed by John Buckley and he had suggested some changes.

Here’s the update:

/* IE9+ Compatibility mode fix  - Bug 1606*/  
	
if (document.documentMode >= 9 && !document.doctype) {
  throw("DocType directive is missing. The recommended DocType in IE 9 is the HTML 5 DocType: <!DOCTYPE html>");
}
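To see when this condition fires without opening IE9, the same expression can be evaluated against plain objects. This is only a minimal sketch: `isMissingDoctypeInIE9` and the three stand-in objects are hypothetical names made up for illustration, and are not part of processing.js.

```javascript
// Hypothetical helper (not part of processing.js): evaluates the same
// condition as the patch against any document-like object.
function isMissingDoctypeInIE9(doc) {
  // documentMode is an Internet Explorer property; where it is undefined,
  // `undefined >= 9` evaluates to false and the check is skipped.
  return doc.documentMode >= 9 && !doc.doctype;
}

// Plain objects standing in for `document` (assumed shapes, for illustration).
var ie9NoDoctype = { documentMode: 9, doctype: null };
var ie9WithDoctype = { documentMode: 9, doctype: { name: "html" } };
var otherBrowser = { doctype: null }; // no documentMode property at all

console.log(isMissingDoctypeInIE9(ie9NoDoctype));   // true
console.log(isMissingDoctypeInIE9(ie9WithDoctype)); // false
console.log(isMissingDoctypeInIE9(otherBrowser));   // false
```

Only the first case would reach the `throw` in the patch.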

I tested the patch on IE9 to make sure it works; results below:

Posted by anurag

Category: Open Source

December 14th, 2011 at 3:14 PM

No response

Update: Processing.js bugs – review needs work

Just as I was beginning to feel exceptionally lucky, my processing.js bug patches were NOT accepted in the first iteration 🙁

Check for IE9 non-HTML5 mode
filter(BLUR) issue

Their status is now changed to “review-needs-work”. And work I shall…with the benefit of comments from John Buckley I am sure it won’t take too long. Except… being only two weeks away from finals for this semester, time is of the essence, and I’m afraid work on them will be put on the back-burner for now.

I did sign up for the second part of the Open Source project though, so I’m sure I’ll get through these bugs, and adopt some more in due time 🙂

Posted by anurag

Category: Open Source

December 2nd, 2011 at 6:03 PM

No response

Running the Mozilla Mouse lock tests

Having written a first draft of the Mozilla mouse lock tests, I went ahead to put it all together and run them on my nightly build.

I created a new test branch on my github for the tests I had written, and added the two new files in the conventional mochi test format:

    <pre id="test">
        <script type="application/javascript">

            /** Test for Bug 633602 **/
            /** Test to see if navigator.pointer object is of MouseLockable type **/
            SimpleTest.waitForExplicitFinish();
            SimpleTest.waitForFocus(function() {
              var c = navigator.pointer;
              var isSameType = c instanceof MouseLockable;
              is(isSameType, true, "navigator.pointer should be an instance of MouseLockable");
              SimpleTest.finish();
            });
        </script>
    </pre>

I added the two tests in the following directory: mozilla-central\dom\tests\mochitest\mouselock

Edited the Makefile.in to include my tests:

_TEST_FILES	= \
    test_isInstanceofMouselockable.html \
    test_mouseLockableHasRequiredMethods.html \

And I was ready to give them a test after merging in all the final changes from humphd.

I rebuilt Firefox in my new test branch after resolving merge conflicts and ran the tests:

TEST_PATH=dom/tests/mochitest/mouselock/test_isInstanceofMouselockable.html make -C ffobjdir mochitest-plain
TEST_PATH=dom/tests/mochitest/mouselock/test_mouseLockableHasRequiredMethods.html make -C ffobjdir mochitest-plain

Results are below. Final step was to send a pull request to the Mozilla MouseLock test module owner Raymond here

Passed is instance of MouseLockable test

Posted by anurag

Category: Open Source

c++, firefox, git, github, makefile, mochi test, mochitest, mouse, mouselock, mouselockable, mozilla

December 2nd, 2011 at 5:48 PM

No response

MouseLockable Mochi Tests – First draft of tests

Hurrah! We (mostly Prof. Humphrey) implemented Mouse Lock in Firefox! Following the specs given by W3C, we implemented Mouse Lock that will be shipped in the new version of firefox in the coming few weeks (I believe by Feb 2012).

To get a grasp of what we accomplished, check out the first demos (ever, on firefox) of the mouse lock implementation:

Mouse Lock Demos

The journey lasted just a couple of weeks, but the number of new concepts and ideas I was introduced to is amazing. This was the first c++ project I was a part of in a non-c++ course (of which I have had only 2). The techniques of Mozilla development, like translating specs into IDL, then coding the classes, using reference tools like MXR and DXR, and debugging in a new environment, have been thrilling to learn and participate in. Of course, all of this was in concert with interacting on irc with the open source community, becoming more familiar with git and the make tool, and most of all, going through the actual process of problem solving, solo and in a team.

However, for all the work already done, about 80% of it, we have another 90% left. If the math doesn’t add up, you can take it up with Prof. Humphrey, whom I have quoted here.

Now the next step of writing tests begins.

I have written 3 Mochi tests for MouseLockable:

navigator.pointer (readonly) is a MouseLockable
MouseLockable has lock(), unlock(), islocked()
“The unlock method cancels the mouse lock state”

Below are the first drafts of the actual js code to test each of them, which I will refactor over the next day and push off to the MouseLockable tests module owner Raymond.

All of them passed when I tried them on Mochi Test Maker on my nightly build, after merging in the latest commits last night.

Test 1: Test to see if navigator.pointer object is of MouseLockable type

<script class="testbody" type="text/javascript">

/** Test for Bug 633602 **/
/** Test to see if navigator.pointer object is of MouseLockable type **/

var c = navigator.pointer;
var IsSameType = c instanceof MouseLockable;
is(IsSameType, true, "Error message");

</script>

Test 2: Test to see if MouseLockable has lock(), unlock(), islocked()

<script class="testbody" type="text/javascript">

/** Test for Bug 633602 **/
/** Test to see if MouseLockable has lock(), unlock(), islocked() **/

var hasProperty1 = MouseLockable.prototype.hasOwnProperty('lock');
var hasProperty2 = MouseLockable.prototype.hasOwnProperty('islocked');
var hasProperty3 = MouseLockable.prototype.hasOwnProperty('unlock');
var allPassed = false;

if (hasProperty1 && hasProperty2 && hasProperty3) { allPassed = true; }

is(allPassed, true, "Error message");

</script>

Test 3: Test to see if unlock() will cancel the lock() state in MouseLockable

<script class="testbody" type="text/javascript">

/** Test for Bug 633602 **/
/** Test to see if unlock() will cancel the lock() state in MouseLockable **/

var np = navigator.pointer;
var locked = np.islocked();
// Call lock() without assigning its return value back to np.lock,
// which would overwrite the method.
np.lock();
var locked2 = np.islocked();
np.unlock();
var locked3 = np.islocked();

var allPassed = false;

if (!(locked) && (locked2) && !(locked3) ) { allPassed = true; }

is(allPassed, true, "unlock() should cancel the state set by lock()");

</script>
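The logic of the three drafts can also be tried outside Firefox against a stand-in object. The sketch below is an assumption-laden illustration: `FakeMouseLockable` is a hypothetical class written only to mimic the `lock()`, `unlock()` and `islocked()` methods listed above, and it is not the real interface, which lives in the browser.

```javascript
// Hypothetical stand-in for the MouseLockable interface, used only to run
// the three checks outside Firefox.
function FakeMouseLockable() {
  this._locked = false;
}
FakeMouseLockable.prototype.lock = function () { this._locked = true; };
FakeMouseLockable.prototype.unlock = function () { this._locked = false; };
FakeMouseLockable.prototype.islocked = function () { return this._locked; };

// Stand-in for navigator.pointer.
var pointer = new FakeMouseLockable();

// Test 1: the pointer object is an instance of the interface.
var isSameType = pointer instanceof FakeMouseLockable;

// Test 2: the prototype defines all three methods.
var hasAllMethods = ["lock", "unlock", "islocked"].every(function (name) {
  return typeof FakeMouseLockable.prototype[name] === "function";
});

// Test 3: unlock() cancels the state set by lock().
var before = pointer.islocked();
pointer.lock(); // call the method without reassigning pointer.lock
var during = pointer.islocked();
pointer.unlock();
var after = pointer.islocked();
var lockCycleOk = !before && during && !after;

console.log(isSameType, hasAllMethods, lockCycleOk);
```

Checking `typeof ... === "function"` is slightly stricter than `hasOwnProperty`, since it also confirms that each property is callable.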

Posted by anurag

Category: Open Source

December 1st, 2011 at 8:03 AM

3 responses

Escape key event in Firefox to exit MouseLock

I’ve been working on another task in the MouseLock implementation: When ESC key is pressed, mouse lock should exit.

I’ve been going through mxr, dxr and comments in other mozilla bugs, and came up with the following code that we might need to check for the ESC key press:

nsEvent* aEvent;  // placeholder: never assigned here, so it has to come from a real event handler
const nsKeyEvent* keyEvent = static_cast<const nsKeyEvent*>(aEvent);
int key = keyEvent->keyCode ? keyEvent->keyCode : keyEvent->charCode;

if (key == NS_VK_ESCAPE) {
  fprintf(stderr, "Escape key is pressed!");
  mIsLocked = PR_FALSE;
}

However, it doesn’t work for now.

It builds if I insert it in the lock method for now, but I am not sure how to make it work. How will the lock/unlock method know to listen for this event? I am not sure how to implement that in c++ yet…will continue working on that.

Also, to those who need help setting up Visual Studio as their FF debugger on Windows: and https://developer.mozilla.org/en/Debugging_Mozilla_on_Windows_FAQ

—————————

Update:

Just found this piece of code:

// if we can use the keyboard (eg Ctrl+L or Ctrl+E) to open the toolbars, we
// should provide a way to collapse them too.
if (aEvent.keyCode == aEvent.DOM_VK_ESCAPE) {
   FullScreen._shouldAnimate = false;
   FullScreen.mouseoverToggle(false, true);
}

http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#4027

Maybe we should also have a variable like _shouldMouseLock. On pressing ESC we can set it to false, and it could trigger the unlock() method. We could use the same variable to trigger unlock when the browser/window/tab loses focus.
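Here is a rough sketch of that idea in plain JavaScript, mainly to think through the state changes. Everything in it is hypothetical: `MouseLockController`, `handleKeyPress` and `handleBlur` are names invented for illustration, and the real implementation would live in Firefox's C++ code. The only borrowed fact is the Escape key code, 27, which is the value of `DOM_VK_ESCAPE`.

```javascript
// DOM key code for the Escape key (the value of KeyEvent.DOM_VK_ESCAPE).
var DOM_VK_ESCAPE = 27;

// Hypothetical controller; the names are illustrative, not Mozilla code.
function MouseLockController() {
  this._shouldMouseLock = false;
}

MouseLockController.prototype.lock = function () {
  this._shouldMouseLock = true;
};

MouseLockController.prototype.unlock = function () {
  this._shouldMouseLock = false;
};

MouseLockController.prototype.isLocked = function () {
  return this._shouldMouseLock;
};

// Escape releases the lock; every other key is ignored.
MouseLockController.prototype.handleKeyPress = function (event) {
  if (event.keyCode === DOM_VK_ESCAPE && this._shouldMouseLock) {
    this.unlock();
  }
};

// Losing focus (browser, window or tab) releases the lock as well.
MouseLockController.prototype.handleBlur = function () {
  if (this._shouldMouseLock) {
    this.unlock();
  }
};

var controller = new MouseLockController();
controller.lock();
controller.handleKeyPress({ keyCode: 65 });            // "A": still locked
var lockedAfterA = controller.isLocked();
controller.handleKeyPress({ keyCode: DOM_VK_ESCAPE }); // Escape: unlocked
var lockedAfterEsc = controller.isLocked();
console.log(lockedAfterA, lockedAfterEsc);
```

Routing both the key press and the focus loss through `unlock()` keeps a single place where the lock state is cleared.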

Posted by anurag

Category: Open Source

November 20th, 2011 at 3:36 AM

2 responses

Progress made on processing.js blur filter bug

It’s been a while since I reported my progress on the blur filter bug assigned to me. I made some progress last week by isolating the shifting caused in the canvas to these lines of code, using breakpoints through firebug.

After playing around for a little bit to determine the magnitude of the shift caused, I saw when I subtracted 200 from the yi and ymi values, the shifting seemed to stop. In this case, my canvas size had been (200,200). I tried changing the canvas size and the value I had to subtract from the variables to seemingly remove any shifting, and it was exactly the value of the width of the canvas. I also tried changing the height of the canvas, to not be the same as the width, and as long as I subtracted the width from those variables, the shifting seemed to be gone.
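One possible reading of why the correction always equalled the width, whatever the height: if the pixels sit in a flat, row-by-row array (an assumption about the buffer layout, not something taken from the processing.js source), then an index that is off by `width` is off by exactly one row. The helpers below are hypothetical and only illustrate that arithmetic.

```javascript
// Row-major layout: pixel (x, y) of an image `width` pixels wide lives at
// index y * width + x in the flat array.
function pixelIndex(x, y, width) {
  return y * width + x;
}

// Inverse mapping, from a flat index back to coordinates.
function pixelCoords(index, width) {
  return { x: index % width, y: Math.floor(index / width) };
}

var width = 200;
var start = pixelIndex(10, 5, width);            // a pixel on row 5
var shifted = pixelCoords(start + width, width); // same index plus one width
console.log(shifted);                            // { x: 10, y: 6 }
```

The height never appears in either formula, which matches the observation that changing the canvas height did not change the value to subtract.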

Next, I needed to find out how to refer to the width of the canvas in the code. I searched the entire processing.js file for variables containing ‘width’ and ‘canvas’, but didn’t find the correct reference. After some time I dropped by the #processing.js channel on irc, and got my answer from pomax. I adjusted the code in the processing.js file as follows:


        // subtract canvas width to adjust for shifting canvas (bug #1392)
        yi += aImgWidth-p.externals.canvas.width;
        ymi += aImgWidth-p.externals.canvas.width;
        ym++;
      }

I ran several tests in processing js helper using different canvas values. When I was confident I had adjusted for the shifting, I started to research how to write ref tests, as pomax had suggested this should be my next step before I request a peer review.

I wound up at this page on how to write ref tests.

Step 1 – I used the following sketch to build the ref test using the generator (located in: processing-js / test / ref / ref-test-builder.html):

void setup ()
{
    size( 100, 100 );
}

void draw ()
{
    for ( int i = 0; i < 5; i++ )
    {
        fill(200);
        ellipse(25,30,35,40);
    }
    filter( BLUR );
    exit();
}

Step 2 – I put the generated code in blurFilter.pde file and in the processing-js / test / ref / directory.
Step 3 – I edited tests.js in the same directory to include the test file I just created as the first test and tagged it under ‘2D’.

var tests = [
  { path: "blurFilter.pde", tags: ["2D"] },
  { path: "stretch.pde", tags: ["3D"] },
  { path: "arc-fill-crisp.pde", tags: ["2D"], epsilonOverride: 0.07 },

Step 4 – I ran index.html, selected 2D tests only, and hit start.
Step 5 – Voila! My test passed!
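For anyone curious how selecting "2D tests only" could work with entries shaped like the ones in tests.js, here is a small sketch. `selectByTag` is a hypothetical helper written for illustration; the actual ref test runner may do this differently.

```javascript
// A few entries in the same shape as tests.js (path plus tags).
var tests = [
  { path: "blurFilter.pde", tags: ["2D"] },
  { path: "stretch.pde", tags: ["3D"] },
  { path: "arc-fill-crisp.pde", tags: ["2D"], epsilonOverride: 0.07 }
];

// Hypothetical helper: keep only the tests carrying the requested tag.
function selectByTag(allTests, tag) {
  return allTests.filter(function (test) {
    return test.tags.indexOf(tag) !== -1;
  });
}

var twoD = selectByTag(tests, "2D").map(function (test) { return test.path; });
console.log(twoD); // [ 'blurFilter.pde', 'arc-fill-crisp.pde' ]
```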

I also ran the test again after reverting the changes I made to adjust the shifting canvas, to make sure the test did fail before my changes. Sure enough, they did, and you can clearly see the direction in which the canvas shifted in the red pixels marking the offset in the failed test.
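The red pixels come from comparing the rendered canvas against a stored reference. The sketch below shows the general idea on a tiny grayscale strip; `diffPixels` and its `tolerance` parameter are hypothetical, and I have not checked how the processing.js runner actually computes its difference.

```javascript
// Hypothetical comparison: returns the indices where two equally sized
// arrays of channel values (0-255) differ by more than `tolerance`.
function diffPixels(expected, actual, tolerance) {
  if (expected.length !== actual.length) {
    throw new Error("images must have the same number of values");
  }
  var mismatches = [];
  for (var i = 0; i < expected.length; i++) {
    if (Math.abs(expected[i] - actual[i]) > tolerance) {
      mismatches.push(i);
    }
  }
  return mismatches;
}

// A 4x1 grayscale strip; the "rendered" version is shifted right by one pixel.
var reference = [0, 200, 200, 0];
var rendered = [0, 0, 200, 200];
console.log(diffPixels(reference, rendered, 10)); // [ 1, 3 ]
```

The mismatches fall on the two edges of the shape, so a shift shows up as an outline along the direction of movement.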

Committed the code here, and the final commit after all changes here. Will be requesting peer review on lighthouse next.

Screen-shots of failed and passed tests attached below:

Mine is the first test in the screenshot; check out the red mark in the failed test screenshot showing how the canvas had been shifting before.

Posted by anurag

Category: Open Source

November 19th, 2011 at 11:28 PM

1 response

Taking on a task in the Mouse Lock implementation

Now that I am getting a handle on the code (limited to the MouseLockable class) and encouraged by the progress made by Diogo and Raymond in implementing some of the tasks required for the Mouse Lock implementation, I decided to jump right in as well.

One of the requirements as per the specs is that the mouse cursor not be displayed while the mouse lock is on. So clearly, this method will be called from within the lock() method which we’ve got a start on. Next idea that immediately comes to mind is to somehow use the code that implements the CSS property of ‘cursor:none’ to achieve the same goal within the window that has mouse lock enabled.

Some of my colleagues have also started on the same road and I thought I would join in and discover with them.

The first breakthrough was to stumble upon this: Bug 346690. It is the bug that implements the Cursor:None property of CSS3 in firefox. Going through the discussion in the comments of that thread gave me various clues like ‘OnHideCursor()’ method and where some of the code relating to this implementation was saved. For windows, it is found in the file: mozilla-central/widget/src/windows/nsWindow.cpp.

I found an interesting part of the code in this file here:

case eCursor_none:
      newCursor = ::LoadCursor(nsToolkit::mDllInstance, MAKEINTRESOURCE(IDC_NONE));
      break;

It seems to be the code where the properties for the cursor are set in c++. So the next step is to study it further, discuss on irc and use mxr and dxr to find more clues as to how the CSS3 cursor none property was implemented.
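At the script level, the behaviour I am after looks roughly like the sketch below: hide the cursor when the lock is taken and restore whatever was there on unlock. `CursorHider` and the fake element are hypothetical and exist only for illustration; the real work would happen in the C++ widget code, not in JavaScript.

```javascript
// Hypothetical wrapper: hides the cursor on lock and restores it on unlock.
function CursorHider(element) {
  this.element = element;
  this.locked = false;
  this.previousCursor = "";
}

CursorHider.prototype.lock = function () {
  if (this.locked) { return; } // keep the originally saved cursor
  this.previousCursor = this.element.style.cursor;
  this.element.style.cursor = "none";
  this.locked = true;
};

CursorHider.prototype.unlock = function () {
  if (!this.locked) { return; }
  this.element.style.cursor = this.previousCursor;
  this.locked = false;
};

// Plain object standing in for a DOM element with a style property.
var fakeElement = { style: { cursor: "pointer" } };
var hider = new CursorHider(fakeElement);

hider.lock();
var cursorWhileLocked = fakeElement.style.cursor;
hider.unlock();
var cursorAfterUnlock = fakeElement.style.cursor;
console.log(cursorWhileLocked, cursorAfterUnlock);
```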

Posted by anurag

Category: Open Source

November 19th, 2011 at 2:33 AM

2 responses
