Data Protection Act

Data Protection Act (1988)
The following blog post will outline the importance of the Data Protection Act and discuss some contemporary issues in an online context.

Act & Commissioner
The Data Protection Act (DPA) is an important piece of legislation aimed at safeguarding individuals' rights over their personal data. If you collect, store or process any data on living people, you are bound to comply with the Act. The DPA sets out eight principles, listed below, which apply to any type of personal data.

Eight Principles of Data Protection


Data Controller, Subject & Processor

Data Protection Act Definitions


The Data Protection Commissioner is responsible for dealing with complaints and ensuring that legal rights are upheld. Helen Dixon was appointed Commissioner in 2014 and is responsible not only for Irish individuals' data but also for data held by companies based in the State, including Facebook, Google and Twitter. The office has been under increased pressure from Europe, as it has been viewed as a 'light touch' when it comes to data protection regulation. There is also concern that the larger tech multinationals based in Ireland are using the country as a 'one stop shop' for European data collection (Weckler, 2015). In the background, the formation of the new European Data Protection Board will bring a pan-European regulator.

EU vs. US Safe Harbour Agreement (2000)
Data protection in the digital age is borderless, and this has created much ambiguity when dealing with multinational data. The European Union (EU) and the United States (US) are the powerhouses when it comes to mass data. Europe follows a social democratic model, while the US follows a more liberal model of government. Each has its own privacy laws, and there is a need to balance data privacy with allowing business and innovation to flow freely.
The EU and US established the Safe Harbour Agreement in 2000 so that US companies could comply with EU Directives. The agreement has been described as shaky, and the revelations of US surveillance practices caused a rift between the two. Recent negotiations maintained the agreement, but the diverging cultural and legal traditions make its long-term effectiveness unlikely. Both sides will have to look for a revised agreement to keep up with the technological and social norms of data protection (Peltz-Steele, 2015).

Cookies
Cookies are small pieces of data sent from a website and stored in the user's web browser. Companies use this data to gain insights into the way people interact with websites. This is not only personal data willingly given but also information about user habits gathered through identifiers on websites. Companies want user data in order to target advertising and to improve the user's online experience. This gathering of data has become more sophisticated, and user information is now even a commodity to be sold. There has been a push to regulate this practice and for governments to set stricter rules for consumer protection. There have been a number of cases taken against companies' use of cookies; Adobe Flash Player is a high-profile example of surreptitiously collecting data from users. The EU introduced the ePrivacy Directive (2002), which lays down the principle that cookies should essentially be used only for legitimate purposes and with the user's knowledge. There is still ambiguity over collection and storage methods, and each Member State has discretion over the interpretation of the Directive (Lanois, 2011).
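To make the mechanics concrete, here is a minimal sketch in R (assuming the httr package is installed; the URL is just an example) that requests a page and lists the cookies the site asks the client to store:

# Minimal sketch: inspect the cookies a site sets on a single request
library(httr)
resp <- GET("https://www.example.com")   # example URL only
cookies(resp)   # data frame of cookie names, values, domains and expiry dates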

Cloud
The Cloud is a network of remote servers used to store, manage and process data. This poses an even bigger risk than traditional desktop computing, as all data is stored online. The identified risks relate to security and people's identity, which puts additional pressure on Cloud vendors to increase their security. Still, the issue of jurisdiction remains, and data may not be secure in countries with less rigorous security measures. Cloud computing is on the rise as a popular, cheap and convenient form of data storage for individuals and businesses. Moving data outside the EU can be considered a breach of the EU Data Protection Directive (Lanois, 2011). Is it feasible for multinational organisations to segregate and store such data?

Breaches
There have been a number of high-profile data breaches which gained widespread public attention. Data breaches are not a new phenomenon, but the increase in digital data means they have the potential to impact millions of people. The Data Protection Commissioner publishes records of data breaches by year and category. The largest data breach recorded in Ireland was at Loyaltybuild, where the personal details and financial data of about 1.5 million customers were hacked in what was described as a 'sophisticated criminal act'. Though serious, this pales in comparison to some of the worldwide security breaches.


Digital Guardian: The History of Data Breaches (2015)

Conclusion
The Data Protection Act is an important piece of legislation for data collection. There are clear guidelines to follow, and the Data Controller is responsible for complying with the Act. The EU and US established the Safe Harbour Agreement in 2000 to govern data across the two jurisdictions. Contemporary issues relating to data protection include cookies and Cloud computing. Cookies are useful for advertising and improving online experiences, but privacy concerns remain. Cloud computing means holding data on remote servers, with security risks a prevalent concern. Data breaches are a common occurrence in the digital age, and concerns over security are impacting millions worldwide.

References
• Data Protection Act (1988 & 2003) ‘A Guide for Data Controllers’, Available at: https://www.dataprotection.ie/documents/forms/NewAGuideForDataControllers.pdf (Accessed on: 31 August 2015)
• Data Protection Commissioner (2015) ‘Data Protection’, Available at: http://www.dataprotection.ie/viewdoc.asp?DocID=4 (Accessed on: 31 August 2015)
• Weckler, A. (2015) ‘Irish Independent – Ireland’s New Data Chief – Forget About the Light Touch’, Available at: http://www.independent.ie/business/technology/news/irelands-new-data-chief-forget-about-the-light-touch-31182694.html (Accessed on: 01 September 2015)
• Peltz-Steele, R.J. (2015) 'The Pond Betwixt: Differences in the US-EU Data Protection/Safe Harbor Negotiation', Journal of Internet Law, 19, 1, pp. 1-15, Business Source Complete
• Lanois, P. (2011) ‘Privacy in the Age of the Cloud’, Journal of Internet Law, 15, 6, pp. 3-17, Business Source Complete
• Digital Guardian (2015) 'The History of Data Breaches', Available at: https://digitalguardian.com/blog/history-data-breaches (Accessed on: 11 September 2015)

Data Security (Internet of Things)

Data Security – ‘Internet of Things’ – be careful what you wish for….
We are all aware of the collection of data in our everyday lives. Individuals want to consume a product or service, and providers gain a value exchange for this personal data. This is all governed by the Data Protection Act 1988, as amended in 2003. There are clear guidelines for the gathering, storage and usage of data, and companies face prosecution for failure to comply.

There has been a shift in direction for data collection, and this is known as the Internet of Things (IoT).

“We define the Internet of Things as sensors and actuators connected by networks to computing systems. These systems can monitor or manage the health and actions of connected objects and machines. Connected sensors can also monitor the natural world, people, and animals.”
(McKinsey Report, 2015)

Essentially, machines are collecting and correlating data to improve our experiences. This has positive implications for improved health data, smart home appliances, automated cars and even having umbrellas stocked when rain is forecast or a cold drink ready on a hot day. So, when thinking of our data, we must consider how the IoT will impact data security.

Innovation for good
There is no doubt that the potential benefits of the IoT are extraordinary. The McKinsey report highlights several areas where the IoT has a realistic chance of creating value in the next decade. The IoT not only creates employment in research and projects but also allows businesses to emerge and improve using the data it produces.


McKinsey Report (2015): the Internet of Things

The monitoring of health is an area where data could improve wellbeing and save people's lives. Smart watches could measure and track people, not only providing data for chronic illnesses but also detecting a stroke or heart attack before it even happens.
Smart cities are becoming more and more likely, with research on improving transportation flow with autonomous cars, power usage through metering, and water and air quality with sensors. This will not only produce savings for city authorities but also improve life for people in the city.

But what are the costs?
There is no doubt that the potential benefits of the IoT could transform the way we do things in the future. But there is one big question that needs to be addressed: how do we keep the data secure?

This needs a lot of thought because, once machines are collecting data, we need to focus on another dimension of regulation.
An article on data security in the Guardian (2015) highlighted some areas where the IoT raises serious concerns. Regulation can become a problem: for example, data from smart meters used to reduce energy bills may fall under several regulators' jurisdictions, including the energy regulator and the broadband regulator, and the data may even be held outside the country.
Security is another big risk, and the potential for the data to be hacked is a serious concern, whether it is a person's pacemaker or a terrorist organisation hacking a vehicle.

Conclusion
The potential for the IoT to shape our future, using data, is a positive step forward. Smart devices will emerge to improve people's lives and society in general. The McKinsey report highlighted a number of areas that will create value in the next ten years. The potential for IoT data to be used negatively is, however, a grave concern. It is important for regulators to understand these technological changes and reflect them in legislation to protect consumers' data. It is important to regulate the access and control of the data and to prosecute any breaches that occur.

References:

Big Data, Better Data?

Big Data Analysis

Businesses are experiencing a tsunami of data and a paradigm shift in how we make it meaningful. In 1965, Gordon Moore concluded that computing would increase in power and decrease in cost at an exponential rate. Moore's Law has held up since that time. So, as we move towards big data, is Moore's Law still valid?
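As a rough illustration of that exponential growth, a few lines of R compound a doubling every two years (the common paraphrase of Moore's Law; the figures are illustrative, not measured):

# Illustrative only: assume computing capability doubles roughly every two years
years <- seq(1965, 2015, by = 10)
relative_power <- 2 ^ ((years - 1965) / 2)
data.frame(year = years, relative_power = relative_power)
# By 2015 this is 2^25, roughly 33 million times the 1965 baseline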

We can see the evolution of data management (see figure below): the move from enterprise resource planning to customer relationship management, then to web-based data and finally to big data. Data is now too big and too unstructured to justify the use of traditional relational database designs.

 

The evolution of big data

Source: http://image.slidesharecdn.com/bigdata-140128092341-phpapp02/95/big-data-6-638.jpg?cb=1390901096

 

Though traditional database management and business intelligence systems are still in use (e.g. SQL), the tide is moving towards the challenges of storing and manipulating big data for management. The biggest winner by far has been Hadoop, the open source software framework managed by the Apache Software Foundation. Businesses are now starting to see how they can make decisions based on big data tools. Some examples of big data in action today are Spotify recommending playlists, Facebook suggesting friends and Netflix picking your next movie or box set.

Asking the right questions

Big data boils down to one thing when looking for an outcome – asking the right question!

The example from Douglas Adams' 'Hitchhiker's Guide to the Galaxy' is an apt parable for Big Data. A supercomputer computes for millions of years to find the meaning of life, the universe and everything. The computer calculates the answer to be '42'. After protests, it is patiently explained that, now that they have the answer, they need to work out what the actual question was – which requires an even more sophisticated computer.

All data is meaningless without the skill to analyse it and yield results. The most successful companies are making decisions based on facts and information. A business must create a strategy and be clear about what information it needs to achieve its goals. An example would be a company that wants to increase its customer base. Some good questions to ask would be 'who are our current customers?' and 'what are the demographics of our most valued customers?' – this makes it easier to identify the big data that can be gathered (Marr, 2015).
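To make that concrete, here is a toy sketch in R of turning the question 'what are the demographics of our most valued customers?' into a simple aggregation (the data frame is entirely invented):

# Toy example: invented customer data, aggregated to answer a business question
customers <- data.frame(
  age_band = c("18-24", "25-34", "25-34", "35-44", "35-44", "55+"),
  spend    = c(120, 340, 410, 95, 560, 80)
)
# Average spend per age band shows where the most valued customers sit
aggregate(spend ~ age_band, data = customers, FUN = mean)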

 

Sources of Big Data


The real problem with big data is not storage or analysis as such – it is transforming the relevant data into useful information. This is not a new phenomenon; making data relevant has been an issue with all data sources, not only big data. Organisations that can architect and design for big data will create a competitive advantage.

Volume, Variety, Velocity & (Veracity)

When analysing the dimensions that characterise big data, prevailing theory outlines three distinct elements known as the 3 V's – Volume (scale or size), Variety (sources) and Velocity (motion). Numerous studies have identified additional elements (see the appendix below – all beginning with V!), but the seminal work from Douglas Laney is still relevant. The 3 V's must be taken into consideration when designing an organisation's Business Intelligence model. However, I would argue for the inclusion of a fourth V when characterising big data, namely Veracity.

Veracity is an important feature of big data because, no matter how accurate data seems, there will always be an inherent uncertainty. Examples include weather patterns, human sentiment, economic factors and future trends (IBM Report, 2012). No amount of data cleansing will make this data fully accurate. It is important to gain information, analyse and forecast using this 'uncertain data' and still create valuable information. An organisation must factor in the four V's to create a competitive advantage from a big data strategic plan.
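A tiny simulated example illustrates the point (the numbers are made up): individual readings can be noisy, yet summarising many of them still yields usable information together with a measure of the uncertainty.

# Simulated 'uncertain data': noisy readings around a true value of 18.5
set.seed(42)
true_value <- 18.5
readings   <- true_value + rnorm(1000, mean = 0, sd = 2)   # each reading is imprecise
mean(readings)   # the aggregate estimate lands close to the truth
sd(readings)     # and the spread quantifies the uncertainty rather than ignoring it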

Big Data challenges

The main challenges arising from the characteristics of big data include (Jagadish et al., 2014):

  • Heterogeneity – the structure of the data must be interpreted, which requires metadata
  • Scale – data sizes outgrow the capabilities of single machines, requiring parallelism across nodes and cloud computing (see the sketch after this list)
  • Inconsistency and incompleteness – diverse sources introduce errors that need to be identified and corrected or mitigated
  • Timeliness – real-time techniques are needed to filter and summarise data as it arrives
  • Privacy and data ownership – there are laws to consider, but also a philosophical argument over who 'owns' personal data.
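The 'Scale' point is the easiest to make concrete. The sketch below uses base R's parallel package as a small-scale stand-in for the node-level parallelism of Hadoop-style systems (it illustrates the split/apply/combine pattern, not a big data tool in itself):

# Split a large vector across two workers, reduce each part, then combine
library(parallel)
cl <- makeCluster(2)                            # two local workers standing in for nodes
parts <- split(1:1e6, rep(1:2, each = 5e5))     # partition the data
partial_sums <- parLapply(cl, parts, sum)       # each worker reduces its share
stopCluster(cl)
Reduce(`+`, partial_sums)                       # combine the partial results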

These are the more technical challenges currently faced when looking at big data. There are wider issues, such as economic, social and political ones, which need to be addressed at an international level. We have seen the fallout from the NSA's collection of data and the impact it has had on people. Big data, and especially personal information, will become more freely available, and this has privacy and security implications. Organisations will have to face not only technical but also moral hazards in the future.

Conclusion – Future Direction

It is no longer possible to handle ever-increasing volumes of data in the way we did in the past. Big Data has caused a fundamental shift, and CPU speeds and other resources struggle to keep pace with these data volumes. Moore's Law is being seriously challenged but is, so far, holding up in the face of big data. If big data continues to proliferate, we may need to examine the validity of Moore's Law in the future.

 

 

 

 

References

  • Marr, B. (2015) ‘Forbes – Big Data: Too Many Answers, Not Enough Questions’ Available at: http://www.forbes.com/sites/bernardmarr/2015/08/25/big-data-too-many-answers-not-enough-questions/2/ (Accessed on 26 August 2015)
  • IBM Report (2012) 'Analytics: The real-world use of big data – How innovative enterprises extract value from uncertain data' Available at: http://www03.ibm.com/systems/hu/resources/the_real_word_use_of_big_data.pdf (Accessed on 29 August 2015)
  • Laney, D. (2001) '3D data management: Controlling data volume, velocity, and variety', Application Delivery Strategies, META Group.
  • Moorthy, J., Lahiri, R., Biswas, N., Sanyal, D., Ranjan, J., Nanath, K., & Ghosh, P. (2015) ‘Big Data: Prospects and Challenges’, Vikalpa: The Journal For Decision Makers, 40, 1, pp. 74-96, Business Source Complete, EBSCOhost (Accessed on 25 August 2015)
  • Martin, K.E. (2015) 'Ethical Issues in the Big Data Industry', MIS Quarterly Executive, 14, 2, pp. 67-85, Business Source Complete, EBSCOhost (Accessed on 25 August 2015)
  • Jagadish, H., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J., Ramakrishnan, R., & ShahabiI, C. (2014) ‘Big Data and Its Technical Challenges’, Communications Of The ACM, 57, 7, pp. 86-94, Business Source Complete, EBSCOhost (Accessed on 25 August 2015)

 

 

 

 

 

Appendix: Characteristics of Big Data

• Volume – The quantum of data generated, stored and used is now explosive: terabytes, petabytes and beyond.
• Variety – Data can now be generated through multiple channels, such as Facebook and Twitter, call centres, chats, voice data, video from CCTV in retail outlets, IoT, RFID, GIS, smart phones, SMS, etc.
• Velocity – Real-time data is accessible in many cases, such as mobile telephony, RFID, barcode scan-downs, click streams, online transactions and blogs. The data generated from all such sources can be accumulated at the speed at which it is generated.
• Veracity – Authenticity of the data increases with automation of data capture. With multiple sources of data, it is possible to triangulate the results for authenticity.
• Validity – The terms veracity and validity are often confused. Validity should be understood as in market research methodology: the data should represent the concept it is expected to represent.
• Value – Return on investment and business value are emphasised more than value for multiple stakeholders.
• Variability – Variance in the data is often treated as the information content of the data. With large temporal and spatial data sets, there can be considerable differences in the data at different sub-set levels.
• Venue – Multiple data platforms, databases, data warehouses, format heterogeneity, data generated for different purposes, and public and private data sources.
• Vocabulary – New concepts, definitions, theories and technical terms are now emerging that were not necessarily required in the earlier context, for example MapReduce, Apache Hadoop, NoSQL and metadata.
• Vagueness – Confusion about the meaning and overall developments around Big Data. Though not necessarily a characteristic of Big Data deployment, it reflects the current context; more clarity is likely to emerge in the future.

Challenges of Big Data Deployment (Moorthy et al., 2015, p.76)

 

 

 

 

R You Ready

R – An Introduction
R is a programming language and environment aimed at data scientists as a tool for computational statistics and visualisation. It has developed into a popular language and data science platform for finance and data analytics companies. R is part of the open source revolution and has been created and supported entirely by developers and experts worldwide. R has a number of advantages, including: every data analysis technique available to download for free, cutting-edge community-reviewed methods, stunning data visualisation infographics, faster results with a manageable programming language, and expert resources.

http://www.revolutionanalytics.com/what-r

R Code School

The best way to get to grips with R is to take the online tutorial through ‘Try R Code School’. Though basic, it runs through the primary sections and gets you acquainted with the R programming language. The tutorial is pirate themed and this made the sections enjoyable and the pirate in-jokes kept me entertained throughout. The seven sections in the tutorial were:
1. Using R
2. Vectors
3. Matrices
4. Summary Statistics
5. Factors
6. Data Frames
7. Real World Data
After completing each section, I was rewarded with a badge and each topic covered the basics to get me started with real world data sets.
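For flavour, here are a few of the basics those sections cover, condensed into one sketch (the values are made up):

# Vectors, matrices, data frames and summary statistics in a few lines
treasure <- c(gold = 3, silver = 7, gems = 12)            # a named vector
loot_matrix <- matrix(1:6, nrow = 2)                      # a 2 x 3 matrix
crew <- data.frame(name = c("Ann", "Bo", "Cap"),          # a data frame
                   age  = c(34, 29, 51))
summary(crew$age)   # min, quartiles, mean and max in one call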

Try R Code School Badges

Analysing the Data
Having previously worked in finance, I have an inherent interest (and experience) in financial analysis and reporting. I decided to use R to pull financial data from the Irish Stock Exchange (ISEQ), focusing on Aer Lingus shares over a ten-year timeframe.
The first part of my research consisted of identifying the most suitable R packages for analysing my data. The most popular packages for extracting financial time series data from internet sources were Quantmod and Quandl. These packages work in a similar vein to a Bloomberg terminal, but at no cost. As I was focusing on historic data, I used Quantmod to extract the data; Quandl would be the preferred package when looking at futures.
http://www.r-bloggers.com/quantitative-finance-applications-in-r/
I installed the Quantmod package from the ‘Packages’ dropdown in R and then tested searching for data using ticker symbols related to Aer Lingus shares – AERL.L.
This command essentially queries Google Finance for the ticker 'AERL.L' and retrieves any data since 1 August 2005. The data is returned as daily prices: Open, High, Low, Close and the Volume traded.

R commands to pull AERL.L data
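The commands behind that screenshot are reproduced in the full listing at the end of this post; the core call is essentially:

library(quantmod)
# Pull daily Open/High/Low/Close/Volume data for Aer Lingus (AERL.L)
# from Google Finance, which quantmod supported as a source at the time
getSymbols("AERL.L", src = "google", from = "2005-08-01", to = "2015-08-14")
head(AERL.L)   # first few rows of the price series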

Now that we have the data set, it is time to analyse it and draw out some interesting information. The first chart I created was a time series showing the share price and the volume traded. This illustrates the shares following an almost U-shaped curve between 2006 and 2015.

Time Series of AERL.L data


We have a large data set giving daily prices of Aer Lingus shares over approximately a ten-year period. Most modelling workflows hold the data in an xts (extensible time series) object in order to extract subsets from the date range. This is widely used when extracting, say, monthly or quarterly data for additional analysis or reporting. This functionality is an example of how R can be used effectively compared with older analytical tools.


R commands to create XTS file and view data sets
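A brief sketch of the kind of subsetting this enables (assuming the AERL.L object pulled above):

library(xts)
is.xts(AERL.L)              # TRUE: getSymbols already returns an xts object
AERL.L["2010"]              # every trading day in 2010
AERL.L["2010-01/2010-03"]   # a single quarter, selected by date range
to.monthly(AERL.L)          # collapse the daily bars to monthly OHLC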

Using the capabilities of the data set, I want to plot a graph showing the closing price of the shares. This is exactly the kind of graph that would be presented to management and is an excellent representation of the data set.


R Graph to visualise closing prices of Aer Lingus Shares

An interesting analysis is to plot the daily log return of the closing prices. The resulting time series graph shows the visual impact of volatility in the share price. We can see that during the financial crisis (2008-2009) the share price was in flux, and this was evident in many traded shares at the time. Since 2010 the share price has continued to fluctuate (though at a lesser rate), which would indicate instability in the company. Based on some further research, the likely explanation is the recovery in the business since 2010 and the recent speculation of a takeover by International Airlines Group (IAG).


Closing daily prices (daily log return)
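For reference, the return series in that chart can be computed directly from the closing prices using the standard diff-of-logs definition (a sketch; the full listing at the end of the post follows the same approach):

# Daily log returns of the closing price; Cl() is quantmod's Close-column helper
AERL.L.Close <- Cl(AERL.L)
AERL.L.ret   <- diff(log(AERL.L.Close))[-1]   # drop the leading NA
plot(AERL.L.ret, main = "AERL.L daily log returns", ylab = "Return")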

 

R Summary Statistics

Summary statistics of Closing Prices

http://www.r-bloggers.com/quantitative-finance-applications-in-r-2/
http://www.r-bloggers.com/quantitative-finance-applications-in-r/

Concepts – If I had more time
It would be extremely difficult to propose a new, effective financial model in such a short timeframe. In the above example we are not using indicators, just raw price data, to determine market direction or trends. This example has demonstrated the power of R at modelling data and presenting it in an excellent visual format. The data is current and can easily be updated through internet sources.
My analysis is limited in the sense that I have taken past data from only one company. An excellent way to enhance the analysis would be to take competitor data and plot the series against each other. This analysis over time would give an insight into market factors.
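As a sketch of that idea (the second ticker symbol is illustrative only and would need to be checked against the exchange listing):

# Pull a competitor series alongside Aer Lingus and plot the closing prices together
getSymbols(c("AERL.L", "RYA.L"), src = "google", from = "2010-01-01")
prices <- merge(Cl(AERL.L), Cl(RYA.L))
plot(as.zoo(prices), plot.type = "single", col = c("red", "blue"),
     xlab = "Date", ylab = "Price", main = "Closing prices: Aer Lingus vs competitor")
legend("topleft", legend = colnames(prices), col = c("red", "blue"), lty = 1)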
Quandl is another package which looks at futures based on financial data. It would be an excellent tool for creating models and predicting future prices based on this information. Another way to analyse the data trends would be to examine internet trends and keywords to see if there is a correlation with market movement. R would be able to analyse large data sets over time, and this could be plotted against the share price chart.

References:

  • Cookbook for R, http://www.cookbook-r.com/ (Accessed: 01 August 2015)
  • Irish Stock Exchange (2015) ‘Market Data’ http://www.ise.ie/Market-Data-Announcements/Companies/Company-data/ (Accessed: 01 August 2015)
  • Chenangen (2014) 'Playing Financial Data Series (1)' Available at: http://www.r-bloggers.com/playing-financial-data-series1/ (Accessed: 03 August 2015)
  • Rickert, J. (2013) 'Quantitative Finance Applications in R (Internet Sources)' Available at: http://www.r-bloggers.com/quantitative-finance-applications-in-r/ (Accessed: 03 August 2015)
  • Rickert, J. (2014) 'Quantitative Finance Applications in R (XTS)' Available at: http://www.r-bloggers.com/quantitative-finance-applications-in-r-2/ (Accessed: 03 August 2015)
  • Revolution Analytics (2015) ‘R is Hot’ Available at: http://www.revolutionanalytics.com/whitepaper/r-hot (Accessed: 03 August 2015)
  • Revolution Analytics (2015) 'What R' Available at: http://www.revolutionanalytics.com/what-r (Accessed: 03 August 2015)
  • Try R Code School (2015) http://tryr.codeschool.com/ (Accessed: 28 July 2015)

R Editor Commands
# Load quantmod, xts and moments
library(quantmod)
library(xts)
library(moments) # to get skew & kurtosis

# Searches from Google and pulls data since 01/08/2005
getSymbols("AERL.L", src="google", from="2005-08-01", to="2015-08-14");

# Plot a time series chart
Sys.setlocale("LC_TIME", "english");
dev.new();
barChart(AERL.L, theme="white");
addBBands();

# Check that the ISEQ data is an xts (time series) object - returns TRUE
is.xts(AERL.L)

# View the dataset
head(AERL.L)
tail(AERL.L)

# Extract the closing price column (assumed step: Cl() is quantmod's Close helper)
AERL.L.Close <- Cl(AERL.L)
is.xts(AERL.L.Close) # returns TRUE
head(AERL.L.Close)

# Plot a graphic profile of the data
plot(AERL.L.Close, main = "Closing Daily Prices for Aer Lingus Shares (AERL.L)",
col = "red", xlab = "Date", ylab = "Price", major.ticks='years',
minor.ticks=FALSE)

# Compute the daily log returns of the closing price and plot them
# (assumed reconstruction of the return calculation)
AERL.L.ret <- diff(log(AERL.L.Close))[-1]
AERL.L.ret

plot(AERL.L.ret, main = "Closing Daily Prices for Aer Lingus (AERL.L)",
col = "red", xlab = "Date", ylab = "Return", major.ticks='years',
minor.ticks=FALSE)

# Summary statistics: mean, standard deviation (volatility), skewness and kurtosis
# (assumed reconstruction of the statistics calculation)
statNames <- c("mean", "std dev", "skewness", "kurtosis")
AERL.L.stats <- c(mean(AERL.L.ret), sd(AERL.L.ret),
                  skewness(AERL.L.ret), kurtosis(AERL.L.ret))
names(AERL.L.stats) <- statNames
AERL.L.stats

Google Fusion Table

Irish Population per County (Source: CSO Census Data)

Google Fusion Tables – Visualise your data

Okay so you have gathered some awesome data and you want to impress your boss with some useful information. Now while bar charts have their place, here is a way to make data visually alive. Thankfully there is a useful application which will do the hard work for you, and impress your boss at the same time.

“Google Fusion Tables is an experimental data visualization web application to gather, visualize, and share data tables.”

https://support.google.com/fusiontables/answer/2571232?hl=en

Google Fusion Tables is a web application tool used to create a visual interpretation of data sets. Data tables can be gathered from public data or imported from your own data. The data is then visualised and can be published and shared on the web. There is a real collaborative feel to the application and the information can be communicated to your target audience with ease.
To get started with Google Fusion Tables, first create a Google account and sign into My Drive. Simply connect Fusion Tables as a new application, for free, and you are ready to begin.

Designing an Irish Population Heat Map
To create a Heat Map of the Irish population by county we needed two specific data tables, namely:
• Population figures by county (.csv file)
• Counties of Ireland data map (.kml file)
There are various ways these can be created, but for this Heat Map the population figures were taken from the most recent CSO Census, carried out in 2011.
http://www.cso.ie/en/statistics/population/populationofeachprovincecountyandcity2011/
The map data was derived from a KML file containing geometry data for all the counties in the Republic of Ireland. This data was essentially used to plot the county boundaries in Google Maps.
http://www.independent.ie/editorial/test/map_lead.kml

The next step was to cleanse the data, an important step in any data exercise. The data from the CSO population table was converted into an Excel document, and it was noticed that some of the counties included subsets which needed to be amended. The 'State' and 'Provinces' rows were removed, and the data for Tipperary North and South was combined into one county. This left the data with 26 counties and a corresponding population figure for each.
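The same cleansing could be scripted rather than done by hand in Excel. A sketch in R (the file name and column names are assumptions about the CSO export):

# Hypothetical file and column names for the CSO 2011 population export
pop <- read.csv("cso_population_2011.csv", stringsAsFactors = FALSE)
# Drop the 'State' and province totals, keeping only counties
pop <- subset(pop, !County %in% c("State", "Leinster", "Munster", "Connacht", "Ulster"))
# Combine Tipperary North and South into a single county figure
tipp_total <- sum(pop$Population[grepl("Tipperary", pop$County)])
pop <- subset(pop, !grepl("Tipperary", pop$County))
pop <- rbind(pop, data.frame(County = "Tipperary", Population = tipp_total))
nrow(pop)   # should now be 26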

The KML file was downloaded into Fusion Tables and contained 99 rows in total; this was the geometry data for the counties. This step is very important, as the data in the two tables must be compatible or the files will not merge correctly.
These tables were uploaded into Google Fusion Tables ready to be 'Merged'. This is where the power of Fusion Tables comes into its own: the map file was opened and, from the File menu, merged with the population table by matching on the county name.

A new tab was created, giving the merged data a visual representation of the population of Ireland by county for 2011. At this point the map needs to be edited to give the Heat Map some visual meaning. It was decided to distribute the counties into six buckets based on population. The figures were distributed as: 0 – 75,000 (6 counties), 75,000 – 100,000 (4), 100,000 – 125,000 (4), 125,000 – 180,000 (6), 180,000 – 250,000 (3) and 250,000 – 1,273,070 (3). Though this was not an even distribution, the counties were easier to distinguish and the map had a clearer visual impact. The counties could instead have been distributed evenly by breaking the population data into equal-sized sets. Each bucket was given a colour, incrementally darker as the population increased. A legend was created, which gives the Heat Map more context when distinguishing the counties' population numbers.
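The same six buckets can be reproduced with cut(), using the break points listed above (this continues the hypothetical pop data frame from the earlier sketch):

# Assign each county to one of the six population buckets used for the heat map
breaks <- c(0, 75000, 100000, 125000, 180000, 250000, 1273070)
pop$bucket <- cut(pop$Population, breaks = breaks, include.lowest = TRUE)
table(pop$bucket)   # should reproduce the 6 / 4 / 4 / 6 / 3 / 3 split described above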

I have made my data public, and this is an important feature of Google Fusion Tables. Anyone can now take my data and use it to carry out further research on population in Ireland.

Irish Population Data in action
The Heat Map of the Irish population could be used in a number of interesting ways depending on the data gathered. The CSO website has a number of detailed databases with well-presented data sets on topics including housing, health, education, the labour market, tourism and transport. These could be used at a macro level by the government to decide on future spending requirements in certain areas. The country is experiencing a housing shortage, and the government is expected to deliver social housing projects. To identify where social housing is most needed, the government could map social housing applicants by area. Plotting these two data sets would give a nationwide Heat Map and identify the areas of greatest need at a more local scale. The KML data would need to be more granular to target specific areas within counties. A well-presented Heat Map would give an excellent representation of specific area shortages, and therefore of where funding is most needed.
Taking data from the 2011 CSO Census, another heat map was created showing the vacancy rates of housing per county. The Heat Map below shows properties that are left vacant per county. This is another example of using CSO data to present a visual Heat Map when reporting on social issues.

Further Practical uses for Fusion Tables
Google Fusion Tables has a variety of functions for making a visual interpretation of your data. Scene perception studies have shown that people gain an increased understanding of pictures based on colour. The most recognisable use is representing weather: news and weather reports are presented with predicted weather patterns and forecasts. This visual information is consumed and used for sea crossings, floods, farming, heatwaves, icy roads and planning journeys.
An excellent use of Heat Maps has been in research on global warming patterns. Predictive maps are powerful when publishing outcomes. The psychological impact of seeing global warming patterns helps with understanding and gives meaning to often complicated data sets.

Conclusion
Google Fusion Tables is an excellent application to present data in a clear visual format. The application is extremely useful for taking geometry data in a KML file and creating a Heat Map using Google Maps. This visual representation, when shared, is an interactive way to present your data to a wider audience. The collaboration element gives the opportunity to enhance data and findings based on original data sources. The application has the potential to give a greater understanding of data sets, in a user friendly visual format.

Bibliography