News Update:

In New York, a Big Data institute aims to stimulate research and the city's economy

By: Unknown on Sunday, 18 January 2015 | 13:10


The Institute for Data Science and Engineering will focus on smart cities, new media, medical analytics, financial analytics and cybersecurity.

Supported by a recent $15 million grant from the City of New York, Columbia University is developing the Institute for Data Science and Engineering (IDSE) to advance the analysis of Big Data, the very large data sets now being generated. Announced this summer, the new institute is overseen by Columbia's engineering school, and received the grant as part of New York City's applied sciences initiative. Ultimately, the goal is to stimulate the city's economy and bring to market the innovations that may emerge from the new institute.

The institute will focus on five areas: smart cities, new media, medical analytics, financial analytics and cybersecurity, says Columbia magazine.
Columbia researchers and professors are already known for their creative analysis (and practice) of Big Data, from journalism professors who use data-mining programs to identify patterns in electoral discourse to medical researchers studying how genetic data could help doctors tailor drug treatments in the future.
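The journalism example above comes down to comparing term frequencies across bodies of political text. A minimal sketch of that idea, with invented candidates and toy "speeches" standing in for a real corpus:

```python
from collections import Counter
import re

# Toy corpus standing in for electoral speeches (invented for illustration).
speeches = {
    "candidate_a": "jobs jobs economy growth taxes jobs",
    "candidate_b": "healthcare education healthcare jobs education",
}

def term_frequencies(text):
    """Lowercase, tokenize and count word occurrences."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

freq_a = term_frequencies(speeches["candidate_a"])
freq_b = term_frequencies(speeches["candidate_b"])

# Terms one candidate uses far more often than the other hint at a
# rhetorical pattern worth investigating.
distinctive = {w: freq_a[w] - freq_b[w] for w in freq_a.keys() | freq_b.keys()}
print(sorted(distinctive.items(), key=lambda kv: kv[1], reverse=True))
```

Real analyses work on thousands of documents and use weighting schemes such as tf-idf, but the principle is the same.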

The university has committed to raising at least $80 million from the private sector and recruiting 75 new faculty members and assistants for the institute by 2030. They will come from disciplines across the university, not just engineering, and will look for solutions to concrete problems through the analysis of very large data sets.
An economic impact analysis by the New York City Economic Development Corporation projects that the institute will generate $3.9 billion in activity for the city over the next thirty years, reports Columbia magazine.

The first phase of the institute's establishment will be completed in 2016, when a physical home for the institute will open in Columbia's existing Mudd building, which also houses its engineering school. Additional space will be taken in Columbia's Northwest Corner Building, which is still under construction; in the future, the institute will extend into the Audubon building on the campus of Columbia's medical school.

Writing this article a few days after Hurricane Sandy hit the East Coast of the United States, I could not help thinking that, given the recent turmoil experienced by New York, it will be particularly fascinating to see whether and how the institute's five key areas converge around the damage left by Sandy. How will data related to Sandy affect the development of smart cities, journalistic coverage of disasters, and the future monitoring of the economic and medical concerns of a city beset by unprecedented storms?

In many ways, the fact that this institute is based in New York could offer unique local opportunities for analyzing post-Sandy data and acting on it... opportunities that could have national and global implications.

Big Data: DARPA is testing a processing method

By: Unknown on Saturday, 17 January 2015 | 13:05


Our computers on wheels, formerly known as cars, are reaching critical mass in terms of global connectivity. Ford has just announced that it has delivered more than 5 million vehicles equipped with SYNC, the connectivity system it developed with Microsoft five years ago. For the manufacturer, this connectivity is the gateway for the car to become "the smartest asset you own."

"SYNC has helped us evolve as a car manufacturer, to think and act more like a technology company, with a new level of openness and access that has forever changed the way we view our business and respond to our customers," said Paul Mascarenas, CTO and vice president of Ford Research and Innovation.

"We have turned the car into a platform offering vast opportunities for developers working with us to keep adding value through new features, delivered at the pace consumers now expect. With more than one billion smartphones in use in the world today, we believe that mobile connectivity will continue to be the cornerstone of our future strategy."

SYNC was designed to adapt and change with new technological developments. While cars and trucks typically remain on the road for more than 10 years on average, consumers often replace their electronic devices every two or three years to keep up with the latest technological advances. The SYNC development team therefore created an architecture based on the Windows Embedded Automotive platform that supports open protocols such as USB and Bluetooth, so that most devices can connect for media playback and communications.

When SYNC was first announced on January 7, 2007 at the International CES, the presentation featured an iPod, a Motorola RAZR flip phone and a Palm Treo smartphone. Just two days later, Apple revolutionized the mobile phone and gave birth to the app economy by announcing its first iPhone.


New capabilities
When customers began driving the first SYNC-equipped car, the 2008 Focus, in the fall of that year, most of them used SYNC to make hands-free calls with their feature phones and to listen to music from an iPod using voice commands powered by Nuance's speech recognition technology.
Five years later, Ford notes that smartphones run on a variety of platforms, including the iOS, Android, BlackBerry and Windows Phone mobile operating systems. "With their built-in storage, processing power that rivals the desktop computers of five years ago, and fast wireless data connections, these phones still work with the original SYNC-equipped vehicles," says the automaker, now something of a computer manufacturer itself.

The system also supports the new capabilities available on most Ford vehicles, such as AppLink, the 911 Assist emergency service, Vehicle Health Report, and SYNC Services, a set of cloud-based services that provides traffic reports, turn-by-turn directions, business search, news, sports scores and movie listings.

"The car is a rich source of real-time data"

"It is clear today that developing an open and scalable connectivity platform was crucial to the success of SYNC, because it has allowed us to keep pace with the consumer," said Paul Mascarenas. "With SYNC, Ford vehicles are no longer locked into the technology installed at the factory; they can keep pace with changing consumer trends via simple software updates."

Cloud connectivity, embedded sensors and data access create an intelligent vehicle experience. Other advances, such as natural language processing and machine learning, could allow SYNC to provide a more natural interaction between car and driver, enabling a more personalized and practical driving experience. "The car is a rich source of real-time data; combined with the processing power available in the cloud, it could become the smartest asset you will ever own," predicts Paul Mascarenas.

Crowdsourcing is outdated: discover what comes next

Tapping the "wisdom of crowds" is one way to search for new ideas, but beyond crowdsourcing, new technology-driven developments are emerging.

Crowdsourcing, that is to say exploiting the "wisdom of crowds", is regarded as a modern way to source innovation and new ideas from outside the company, and to take advantage of the confrontation between multiple disciplines.
Leading companies and organizations such as Dell and NASA use crowdsourcing approaches to generate new innovations.

However, if technology has taken crowdsourcing to a new level, it has also made the approach outdated and potentially risky. Such is the observation of Thomas Frey, futurist at the DaVinci Institute and editor of the site FuturistSpeaker.com. He poses this provocative question: "If you had the choice between making the trip from Boston to San Diego in a plane piloted by a single machine or by the combined intelligence of 3,000 people, which option would you choose?"

Thomas Frey notes that the "crowd" has often been wrong, from a historical perspective. For example, he recalls that it was the wisdom of crowds that led to the 1929 stock market crash and the collapse of the real estate market in the 2000s.

He says that crowdsourcing is losing its luster as a means of innovation and decision-making, and is being replaced by three technology-driven developments.

The rise of opinion leaders: these are the active voices in social networks, and marketers are learning to persuade these highly influential people "to recommend their products and services more positively," says Thomas Frey.

The rise of Big Data analytics: the Big Data phenomenon is accelerating decision-making, moving from slow, deliberative approaches (such as steering committees or boards of directors) to "data-mining techniques that reveal almost instantly the opinions of vast constituencies or the purchasing whims of a target market," says Thomas Frey. "The excessively slow processes of yesterday are being replaced by an entirely new social norm."

The rise of machine-augmented human intelligence: perhaps paradoxically, technology and artificial intelligence are also helping to raise individual human intelligence, says Thomas Frey. "As we move into the era of very large data sets, we will continue to explore more sophisticated ways to capture and exploit the wonders of the human mind, and to find unusual ways to draw on those with the most intelligence."

Thomas Frey acknowledges that while decision-making is being enhanced by new technologies, crowdsourcing can continue to play a role in stimulating innovation. Many businesses and crowdsourcing communities, for example, limit participation to closed networks. Thus, companies experimenting with crowdsourcing apply the approach to their own employees or partners, ensuring more relevant and targeted ideas.

There are also crowdsourcing initiatives limited to scientists or researchers. Crowdsourcing is part of the extensive set of tools that technology makes available today for reaching faster and more innovative solutions... provided they are used correctly.

Why algorithms cannot do without humans to predict the weather

By: Unknown on Friday, 16 January 2015 | 12:35


Meteorology relies as much on human analysis as on computer programming: a combination of computer and human intelligence produces the most accurate weather forecasts.

History is full of intellectuals who worshiped theories of determinism, that is, ideas suggesting that if we could know every aspect of a situation, every detail of the context, we could predict and influence future political, economic and cultural outcomes.

However, when it comes to weather, forecasters have long since abandoned any hope of cataloging all the variables that could affect precipitation in Seattle or the arrival of a cold front in New York. At least that is what Nate Silver writes in his new book, "The Signal and the Noise: Why So Many Predictions Fail - but Some Don't", an excerpt of which was adapted for a recent New York Times Magazine.

To believe his claims, weather forecasting is something of an occult science. Despite all the measurements, modeling and statistical analysis, meteorology relies as much on human analysis as on computer programming. The best proof is the historical record of the National Weather Service, the meteorological service of the United States. According to the agency's data, it is a combination of computer and human intelligence that produces the most accurate weather forecasts. Humans improve the accuracy of temperature and precipitation forecasts by about 25% and 10% respectively, compared with forecasts made by computers alone.

In other words, the algorithms have not yet defeated us.

While modern futurists imagine an era when computers will out-think the most advanced individuals, it turns out that the human mind will always have a role to play. In weather forecasting, even the most sophisticated computer models constantly contradict one another. It falls to the people who study these models to bring nuance and add context, whether that means how best to weigh the variables that determine where a storm is heading, or how morning fog in the northeast tends to dissipate quickly when the wind blows in a certain direction.

No matter how powerful, computers cannot simply "see". Moreover, they are not necessarily as good as humans at knowing where and when to look for further information. Our obsession with large volumes of data, big data, and with the quantification of industries (finance, advertising, aerospace) can sometimes blind us to the fact that human perception and analysis, as vague and imprecise as they are, remain essential to the progress of society. Perhaps in the future our descendants will laugh at our fixation on numbers. Or maybe they will simply recognize, better than we do, that numbers are only part of the equation.

A European project to help SMEs manage big data

Since early October, the Institute of Computer Science of the University of Neuchâtel has been coordinating a European research project on the management of large amounts of web data for small and medium-sized enterprises.
The web grows each day by a mass of data equivalent to 8 times the entire US library catalog. For example, 35 hours of video are uploaded to YouTube every minute. But to take advantage of these data, SMEs face high costs, as extracting and processing this information requires huge storage and computing capabilities.

To solve this problem, the European project LEADS was established, dedicated to the management of large amounts of web data for small and medium-sized enterprises. With a total budget of 4.25 million euros over three years, it brings together four universities and three companies spread across 5 countries. It has been coordinated by the University of Neuchâtel since early October under the responsibility of Étienne Rivière, lecturer at the Institute for Informatics at the University of Neuchâtel (IIUN) in Professor Pascal Felber's team. For this project, the IIUN will receive an EU grant of almost 750,000 euros.


Development of data encryption and micro-cloud federation


In the spirit of the cloud, the strength of this project lies, according to the University of Neuchâtel's statement, "in its ability to ensure the confidentiality of private data. Encrypted with a key that is not known to the LEADS infrastructure, these data can be subjected to processing and queries, which are themselves encrypted." According to Étienne Rivière, "this is particularly innovative because it involves implementing encryption techniques that have been proposed recently or will be developed within the project. This typically involves comparisons, sorts, or extractions, which constitute the essence of large-scale data processing operations."
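The LEADS techniques themselves are not detailed in the article, but the general idea of querying encrypted data can be illustrated with deterministic tokenization: if the same plaintext always maps to the same token, the infrastructure can answer equality queries without ever learning the values. A toy sketch (HMAC-SHA256 stands in for a real scheme; the key and data are invented):

```python
import hashlib
import hmac

# The key stays on the client side and is never shared with the infrastructure.
SECRET_KEY = b"client-side key, never uploaded"

def token(value: str) -> str:
    """Deterministically 'encrypt' a value: equal plaintexts yield equal
    tokens, so the server can match them without seeing the plaintext."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

# The client tokenizes records before upload; the store sees only tokens.
encrypted_store = [token(city) for city in ["Paris", "Geneva", "Paris"]]

# An equality query is itself encrypted: the server just compares tokens.
query = token("Paris")
matches = sum(1 for t in encrypted_store if t == query)
print(matches)  # 2
```

Real systems go much further (order-preserving or homomorphic schemes to support sorts and aggregations), which is precisely the research territory the quote describes.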

In addition, the LEADS project will promote a federation of geographically distributed micro-clouds, so that operations take place in locations close to the customer or to the source of the data. The objective is to avoid, for example, processing data in the French part of the web while storing it in Brazil, explains Étienne Rivière: "This would generate huge transatlantic data flows, with very few connections coming from Latin America. Locating storage and processing in the same high-density area would be much more advantageous." The researchers in Professor Felber's team will therefore study how to take the locality of access and processing into account. For this they will use servers donated to the IIUN this spring by Yahoo!, in association with the infrastructure of the other members of the consortium.

Big data: 4 questions about your network

By: Unknown on Thursday, 15 January 2015 | 12:18


CIRCLE. Big Data is transforming the way organizations collect data and use decision-making tools, giving access to information that improves productivity, innovation and competitiveness. So it is no surprise that it is high on the agenda of many IT organizations. However, implementing Big Data presents some challenges.

Big Data initiatives lead to large, elastic databases, deployed horizontally on clusters that can span thousands of servers. The network that connects these servers, and its design, are key. A poorly designed network can inhibit the launch and expansion of Big Data initiatives, undermining the value they could provide. Here are four questions you need to ask yourself (and your network equipment vendor).

1) How do I ensure the reliability of data?


The Hadoop Distributed File System (HDFS) places each block of data on a node in one rack, then places replicas on two other nodes in different racks. Because HDFS is aware of the topology, the replicas end up on nodes connected to different switches. No additional intelligence needs to be built into the switches.

In case of failure, HDFS relies on the network to maintain reliability. However, data replication can take hours. For example, it takes about 66 hours to transfer 30 TB of data over a 1 GbE network - and taking network latency and actual network performance into account, it can take much longer. If the network does not have enough bandwidth, the risk is that a data node loses connectivity with the name node, and HDFS then loses reliability.

With a 10 Gbps infrastructure, replicating a 3 TB disk takes about 40 minutes. Moreover, network performance is far superior (see below), ensuring the bandwidth required for a reliable connection between data nodes and the name node.
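Transfer times like these can be checked with back-of-the-envelope arithmetic. A small helper, idealized in that it ignores protocol overhead, latency and disk limits (real transfers are slower):

```python
def transfer_time_hours(terabytes: float, link_gbps: float) -> float:
    """Ideal transfer time over a dedicated link: volume in bits
    divided by link speed in bits per second, converted to hours."""
    bits = terabytes * 1e12 * 8          # decimal TB -> bits
    return bits / (link_gbps * 1e9) / 3600

print(f"{transfer_time_hours(30, 1):.1f} h")        # ~66.7 h for 30 TB at 1 GbE
print(f"{transfer_time_hours(3, 10) * 60:.0f} min")  # ~40 min for 3 TB at 10 GbE
```

The same arithmetic explains why moving from 1 GbE to 10 GbE shrinks a replication window by an order of magnitude.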

2) How can I ensure that performance will be good enough?

Traditional multi-tier networks are inefficient when it comes to supporting the performance of multi-rack Big Data applications. The latency of a typical top-of-rack switch may be only one microsecond, but the latency of the distribution and core switches is significantly greater.

The problem is compounded by the oversubscription introduced at the distribution and core tiers. A top-of-rack switch can operate with an oversubscription ratio of 3:1. But if we add an oversubscription ratio of 4:1 at the distribution switch, we end up with an overall oversubscription of around 10:1. As a result, the bandwidth actually available for replication to a data node configured with a 20 Gbps connection is only 2 Gbps.
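Effective bandwidth under stacked oversubscription is simply the node's link speed divided by the product of each tier's ratio. A quick sketch (note that strictly multiplying the 3:1 and 4:1 ratios quoted above gives 12:1; the article's rounder 10:1 figure is what yields the 2 Gbps result):

```python
from functools import reduce

def effective_bandwidth_gbps(node_link_gbps: float, tier_ratios: list) -> float:
    """Worst-case bandwidth left to a node once the oversubscription
    of every network tier on the path is applied."""
    overall = reduce(lambda a, b: a * b, tier_ratios, 1.0)
    return node_link_gbps / overall

print(effective_bandwidth_gbps(20, [10]))               # 2.0 Gbps at 10:1
print(round(effective_bandwidth_gbps(20, [3, 4]), 2))   # 1.67 Gbps at a strict 12:1
```

Either way, the headline 20 Gbps link delivers only a tenth or so of its nominal capacity for cross-rack replication.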

In addition, with Big Data applications, it is not acceptable to lose packets as data moves within the data center. On an Ethernet network, the answer to this problem is usually provided by Data Center Bridging (DCB) technology. But not all switches currently available on the market support it.

An adequate infrastructure should avoid intermediate switches, so that each server is only one hop from any other, which significantly reduces latency. Eliminating intermediate switches also substantially reduces oversubscription, which means lower latency for traffic and higher throughput.

3) How can I make sure my big data project can scale?

Elasticity is essential for big data initiatives. However, as the network grows, the inherent restrictions of traditional infrastructure induce more latency and oversubscription.
But we must also consider the issue of data collection. A big data initiative means collecting structured and unstructured data in real time. The network must therefore support direct access to storage networks, which is ideal, but which some network architectures do not allow.

Choosing an architecture that is elastic at Layers 2 and 3, and that can be optimized for FCoE, iSCSI and NFS storage networks, ensures you do not have to worry about your growing needs.

4) How can I administer it easily?
Traditional multi-tier networks are inherently complex to administer. The number of intermediate switches increases with the number of servers and racks, and both the administration and the maintenance of the network become more complex as a result.

A true fabric, based on a single operating system, behaves as one converged Ethernet switch and can be administered as such. Provisioning, administration and maintenance are greatly simplified.

Ask the right questions

There is no doubt that, as they materialize, big data initiatives will yield the information needed to pilot new activities and to bring out new ways of doing business. But to ensure that their networks do not become an obstacle, organizations must make sure they have asked the right questions about network design.

"Big data": a collaborative and agile way of working that accelerates business development

CIRCLE. "Big Data" projects make information understandable, usable, shareable and open. They are, as expected, the best tool for more cross-functional and genuinely collaborative work in the company.
Another change: "Big Data" time is fast - 5 to 8 weeks to extract the data, analyze it, build reliable models and trigger the first actions. All of this is carried out for and by "task forces" that genuinely mix experience and cultures, increasing diversity and therefore the richness of the analyses. It is, at last, real technological leverage for the matrix organization, which has suffered from its origins from the non-sharing or misunderstanding of information between the functional, geographical or business "silos" of the past.

A "Big Data" project consists in marrying large volumes of detailed, heterogeneous data in order to build analytical models in 5 to 8 weeks. For example, we might study the behavior of a site's consumers based on page views, prequalified products, the waiting time between two actions and the time of connection, or link these behaviors to comments posted on blogs. We can thus build models to better target potential customers and act quickly (offers, follow-up calls, additional services).
Companies can now understand complex phenomena and, above all, share these analyses to increase their collective intelligence.
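The behavioral signals mentioned above (page views, waiting time between actions, conversion) first have to be aggregated from a raw clickstream into per-session features before any model can be built. A minimal sketch, with an invented toy clickstream:

```python
from datetime import datetime

# Toy clickstream (invented): (session_id, timestamp, event)
events = [
    ("s1", "2015-01-15 10:00:00", "view:laptop"),
    ("s1", "2015-01-15 10:00:40", "view:laptop-bag"),
    ("s1", "2015-01-15 10:05:40", "checkout"),
    ("s2", "2015-01-15 11:00:00", "view:laptop"),
]

def session_features(events):
    """Aggregate raw events into per-session features: page views,
    dwell time in seconds, and whether the visit converted."""
    sessions = {}
    for sid, ts, event in events:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        s = sessions.setdefault(
            sid, {"views": 0, "first": t, "last": t, "converted": False}
        )
        s["views"] += 1
        s["first"], s["last"] = min(s["first"], t), max(s["last"], t)
        s["converted"] = s["converted"] or event == "checkout"
    return {
        sid: {
            "views": s["views"],
            "duration_s": (s["last"] - s["first"]).total_seconds(),
            "converted": s["converted"],
        }
        for sid, s in sessions.items()
    }

print(session_features(events))
```

Feature tables like this one are what a scoring or targeting model is then trained on.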

Unprecedented analysis capabilities that facilitate collaborative work

Granularity is not a new concept (one "zooms in" to the finest-grained data to develop specific analyses: profitability per order line, tracking of physical product movements, analysis of an individual consumer's behavior...), but "Big Data" provides the ability to cross-reference and analyze detailed data so as to discover previously unknown information. An immediate example is the use of data from smart meters in the field of energy distribution.
Access to such analytical power over varied data sets provides true added value: it reveals behaviors, trends, "patterns" and groupings. Where "Data Mining" worked on samples, "Big Data" technologies offer the ability to discover trends and relationships across an entire population. It becomes possible to test hypotheses in real time and to base decisions on facts. And while hundreds of attributes characterized a client or profile yesterday, tomorrow businesses will potentially have thousands of attributes.

Sharing a complex environment

Traditional analytical systems - symbolized by dashboards - mainly propose figures and indicators per "silo", whose interpretation can sometimes be difficult. Such tables cannot capture a complex universe like a car factory, for example, where you have near real-time access to human resources data (team presence), logistics and robotics, using internal standards and those of suppliers, combined with flow data per asset and per person. A picture is much more meaningful than tables: it lets you act and communicate. Another example: what could be simpler than adding a photo taken with a phone to an insurance claim?

Such analyses require close collaboration between marketing, sales, call centers, logistics, management controllers... No one can do it alone, nor withhold information that could benefit other departments. With the data as arbiter, and especially as resources are scarce, cross-functional work grows around shared objectives, and "Big Data" comes to embody the collective intelligence of an organization.

Newfound agility

The formats of data from new devices - mobile devices, sensors and RFID chips, "Open Data", web applications, blogs, social networks - are very different, but exploiting them is hardly a problem any more, although some techniques remain to be refined (such as extracting content from videos).
Development prospects are virtually unlimited, because they now depend on the company's ability to imagine new combinations. These sources are starting to be used in insurance and security: social networks are used to verify claims. Some brands are able to compare their sales with mentions of their products on the Internet (blogs, content sites) or in "tweets". With the coordinates attached to these "tweets", they can quickly contact potential clients. A final example: "call centers" analyze voice recordings.

A rapid and iterative project mode

The five phases of a typical "Big Data" project are: ROI study, Extraction, Construction of models and scenarios, Analysis, and Action. All in less than 2 months. We are far from the timelines of ERP implementation projects. Instead we have a galaxy of quick projects that enrich one another (a commercial profitability model can be taken up by logisticians and buyers to extend the analysis to the supply chain).

"Big Data" is nothing without the visualization technologies that accompany it

Today, companies are reviewing their portfolios of analytic applications with a taste for "best of breed". The current software market indeed offers a constellation of innovative technologies, each meeting one or more specific needs. So there are solutions for every company that wants to exploit, in an agile way, the varied and detailed data at its disposal.

Since detailed data are being manipulated, "Big Data" is also synonymous with proactivity: it becomes easier to build scenarios and simulations. For example, what would be the impact of a promotion in the city of Marseille during a given cultural event (with attendance figures retrieved from the city's "Open Data" sites), and depending on the weather? I can simulate the evolution of sales and margin, and therefore discuss the cost of the promotion factually with distributors.
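A what-if simulation of this kind can be tiny once the drivers are quantified. A sketch with entirely invented figures (baseline sales, uplift factors, margin rate and promotion cost are all illustrative assumptions):

```python
# Invented assumptions for a promotion simulation in one city.
baseline_daily_sales = 10_000          # euros/day
event_attendance_uplift = 0.30         # +30% of sales during the event
weather_uplift = {"sunny": 0.10, "rainy": -0.15}
margin_rate = 0.25
promo_cost = 5_000                     # euros

def simulate(days: int, weather: str) -> float:
    """Extra margin generated by the promotion under one weather scenario."""
    uplift = event_attendance_uplift + weather_uplift[weather]
    extra_sales = baseline_daily_sales * uplift * days
    return extra_sales * margin_rate - promo_cost

for weather in ("sunny", "rainy"):
    print(weather, simulate(7, weather))
```

Running the two scenarios makes the discussion with distributors factual: under these assumptions the promotion pays off in good weather and loses money in bad.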

In addition, traditional dashboards will gradually mutate into tools offering real information rather than just aggregated data - this is the concept of "Discovery Analytics", which monitors operations and not just final performance. Here too, "Big Data" will help revolutionize the business.

The revolution is in the uses


The ability to combine more data will strengthen cross-functional work in the business. And the more cross-functional the organization, the greater the responsiveness and the faster the cycle times - faster responses to customer requests, to changing needs, to questions from partners and suppliers. This transforms our modes of collaboration and challenges the traditional hierarchy. Transverse functions (Purchasing, HR, Marketing, Finance...) will make their data available, in intelligible form, to serve the business; their role in day-to-day decision processes will be different.

This cross-functionality strengthens operational autonomy of decision and action. For example, a sales representative will instantly see that a customer is not profitable and, above all, why: too many deliveries, or an unbalanced product mix compared with other comparable customers. He will be autonomous in understanding the issues and able to act quickly as a result.
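The drill-down described above - spotting not only that a customer is unprofitable but why - can be sketched in a few lines. All figures here are invented for illustration:

```python
# Toy per-customer figures (invented): same revenue, very different logistics.
customers = {
    "ACME":   {"revenue": 100_000, "product_cost": 70_000,
               "deliveries": 480, "cost_per_delivery": 80},
    "Globex": {"revenue": 100_000, "product_cost": 70_000,
               "deliveries": 60, "cost_per_delivery": 80},
}

def profitability(c):
    """Margin after product and delivery costs, plus the delivery cost
    itself so the 'why' is visible alongside the 'what'."""
    delivery_cost = c["deliveries"] * c["cost_per_delivery"]
    margin = c["revenue"] - c["product_cost"] - delivery_cost
    return margin, delivery_cost

for name, c in customers.items():
    margin, delivery_cost = profitability(c)
    heavy = margin < 0 and delivery_cost > c["revenue"] * 0.2
    print(name, margin, "unprofitable: delivery-heavy" if heavy else "ok")
```

With identical revenue and product costs, the breakdown immediately attributes ACME's negative margin to its delivery volume, which is exactly the actionable insight the paragraph describes.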

This transformation will inevitably be accompanied by an evolution in the way data warehouses are built, and even in project management methods and technical architectures. The trend is toward decentralization of the data components, leading ultimately to a purely logical view (we will no longer think "data center" but "data", somewhat in the manner of "cloud computing" today). In addition, operational staff will reclaim the IT skills their work requires. Consequently, the landscape of the IT department should also be transformed in the coming years.
More surprisingly, the company's boundaries are shifting. Whose data is it that decisions are built on: the customers'? The public space's? Where does the company's responsibility end and the employee's begin? Companies are increasingly tightly interconnected with their customers, suppliers, the personal networks of their employees and the public space. They increasingly resemble a mobile confederation (suppliers, manufacturers, customers, employees, investors, public space) around a brand, service or product.

Scarce skills

While "Big Data" stems from a technological evolution, it heralds a profound revolution. The first applications show the potential of data that was already available but unexploited (smart meters do not invent consumption data - nothing new there - but the uses we can make of it are fantastic: reducing the bill, optimizing the network, securing the premises).

We will see the emergence of a real data market, in which corporate data and personal data will be valued; this raises safety and security issues that a few countries, such as France, have begun to regulate.
Our companies will make further progress in cross-functional work, in triggering "task forces" and in empowering operational staff. There is still some way to go, and the technology market will mature. Some issues must be addressed urgently, such as the availability of skills (an acute problem for 46% of the professionals surveyed in the last TDWI survey)1. The first to evolve will quickly take several steps ahead.

Big data: to avoid the nightmare, strengthen your network performance and build on the best profiles!

By: Unknown on Wednesday, 14 January 2015 | 11:05


CIRCLE. Stéphane Duproz - Internet access everywhere and for all, and the proliferation of digital devices and services, have generated the phenomenon the Anglo-Saxons call "Big Data": an inflation of data such that it becomes difficult or impossible to manage in conventional databases.

For businesses, big data can become a nightmare - "fat data" - or, on the contrary, a great opportunity if they know how to seize it. The nightmare is easy to picture: invaded by data (for example, 3 billion documents are exchanged each month on Facebook) and not knowing how to deal with it, the company settles for investing in storage arrays which, over the months, take on the appearance of those linear kilometers of archived documents languishing in the basements of some large companies and administrations. This vision is a huge waste, because data that is no longer used becomes a burden and, above all, a cost center.

Success through high-performance connectivity

The challenge for the company is to keep its data alive so that it is used effectively, which requires systems for dynamically routing and analyzing information. In this context, neutral data centers offer the ideal environment to streamline servers, organize flows, and also host and protect data. This of course implies high-performance data center connectivity, offering not only access to various network operators but also seamless interconnection with other data centers. To illustrate the importance of connectivity performance, consider high-frequency trading, which uses computers to execute financial transactions in extremely short times (on the order of microseconds). A connection that is faulty, or merely slower than the competition's, can cost an operator tens of millions of euros.

Downstream, the company will use business intelligence solutions, because it must be able to make the right information available to the right people - customers, employees, suppliers - at the right time. And that is where big data is a tremendous opportunity: the more organized data a company has, the more it will be able to use it to develop its business. More available data means more knowledge about customers, prospects and competitors, and therefore the ability to target production and marketing campaigns.

Big data will lead to the creation of skilled jobs


A study published a few months ago by McKinsey shows how big data can become an extraordinary source of productivity for the company. Health, retail, government, manufacturing... all these sectors have much to gain if they learn to use their data. Analysts estimate that a retailer that takes full advantage of big data could see its operating margin increase by 60%. But big data does not only affect company performance. It is also, says the study, an incredible source of employment: nearly 200,000 analysts and over 1.5 million managers capable of analyzing data will be needed by 2018 in the United States alone.

Admittedly, the road is still long before all companies have not only understood the issue but, above all, implemented the means to address it. For IT departments, data management, particularly big data, will remain a puzzle - especially since the arrival in the workplace of Generation Y, heavy producers and consumers of unstructured data, will amplify the phenomenon. It is in this context that neutral data centers, whose core business is the management of highly connected environments, are emerging as the physical pillar of big data. Because the future lies in the efficient processing of data, big data will tomorrow be an asset for the companies that have invested the necessary resources to manage it.

Copyright © 2011. Blogging Brain . All Rights Reserved.