A New Kind of Science is a best-selling [1] and controversial book by Stephen Wolfram, published by his own company in 2002. It contains an empirical and systematic study of computational systems such as cellular automata. Wolfram calls these systems simple programs and argues that the scientific philosophy and methods appropriate for the study of simple programs are relevant to other fields of science. Continue reading “A New Kind of Science”
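To make the idea of a “simple program” concrete, here is a minimal Python sketch (illustrative only, not code from the book) of the Rule 30 elementary cellular automaton, one of the systems the book studies:

```python
# Evolve the Rule 30 elementary cellular automaton for a few steps.
RULE = 30

def step(cells):
    """Apply the elementary CA rule to one row of 0/1 cells (wrapping edges)."""
    n = len(cells)
    return [
        # Neighborhood (left, center, right) indexes a bit of the rule number.
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 15 + [1] + [0] * 15   # a single black cell in the middle
for _ in range(16):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

Despite the three-line update rule, the output pattern is famously complex, which is the book's central observation.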
Natural language processing
Natural language processing ( NLP ) is a field of computer science, an artificial intelligence that deals with the interaction between computers and human (natural) languages, and, in particular, concerns with programming computers. Continue reading “Natural language processing”
Hydroinformatics
Hydroinformatics is a branch of informatics which concentrates on the application of information and communications technologies (ICTs) in addressing the increasingly serious problems of the equitable and efficient use of water for many different purposes. Growing out of the earlier discipline of computational hydraulics, the numerical simulation of water flows and related processes remains a mainstay of hydroinformatics, which encourages a focus not only on the technology but on its application in a social context. Continue reading “Hydroinformatics”
Humanistic informatics
Humanistic informatics (also known as humanities informatics) is one of several names chosen for the study of the relationship between human culture and technology. The term is fairly common in Europe but little known in the English-speaking world, though digital humanities (also known as humanities computing) is in many cases roughly equivalent. Continue reading “Humanistic informatics”
Museum informatics
Museum informatics [1] is an interdisciplinary field of study that refers to the theory and application of informatics by museums. It is in essence a sub-field of cultural informatics [2] at the intersection of culture, digital technology, and information science. In the digital age, its place in the museum and archival world has grown substantially, and it has connections with digital humanities. [3] Continue reading “Museum informatics”
Viroinformatics
Viroinformatics is an amalgamation of virology with bioinformatics , involving the application of information and communication technology in various aspects of viral research. Currently there are more than 100 different applications concerning diversity analysis, viral recombination, RNAi studies , drug design , protein-protein interaction , structural analysis and so on. [1] Continue reading “Viroinformatics”
Systems biology
Systems biology is the computational and mathematical modeling of complex biological systems . It is a biology -based interdisciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach ( holism instead of the traditional reductionism ) to biological research. Continue reading “Systems biology”
Stylometry
Stylometry is the application of the study of linguistic style, usually to written language, but it has been successfully applied to music [1] and to fine-art paintings [2] as well. [3] Continue reading “Stylometry”
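As a sketch of the kind of feature stylometry works with, the Python snippet below computes relative frequencies of a few common English function words, a classic stylistic marker; the word list and text are illustrative only:

```python
from collections import Counter

# A tiny illustrative marker set; real studies use many more features.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "was"]

def style_vector(text):
    """Relative frequencies of common function words in a text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]

print(style_vector("It was the best of times, it was the worst of times"))
```

Vectors like this can then be compared across texts of known and disputed authorship.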
Simulation governance
Simulation governance is a managerial function concerned with assurance of the reliability of information generated by numerical simulation. The term was introduced in 2011 [1] and specific technical requirements were addressed from the perspective of mechanical design in 2012. [2] Its strategic importance was addressed in 2015. [3] [4] At the 2017 NAFEMS World Congress in Stockholm, simulation governance was identified as the first of eight “big issues” in numerical simulation. Continue reading “Simulation governance”
Semantic analysis (computational)
Semantic analysis (computational) is a composite of the “semantic analysis” and the “computational” components.
“Semantic analysis” refers to a formal analysis of meaning, and “computational” refers to approaches that in principle support effective implementation. [1] Continue reading “Semantic analysis (computational)”
Geoinformatics
Geoinformatics is the science and technology which develops and uses information science infrastructure to address the problems of geography, cartography, geoscience and related branches of science and engineering. Continue reading “Geoinformatics”
Geographic information system
A geographic information system ( GIS ) is a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data . The acronym GIS is sometimes used for geographical information science (GIScience) to refer to the academic discipline that studies geographic information systems [1] and is a broad domain within the broader academic discipline of geoinformatics . [2] What goes beyond a GIS is a spatial data infrastructure , a concept that has no such restrictive boundaries. Continue reading “Geographic information system”
Financial modeling
Financial modeling is the task of building an abstract representation (a model) of a real world financial situation. [1] This is a mathematical model designed to represent (a simplified version of) the performance of a financial asset or portfolio of a business, project, or any other investment. Financial modeling is a general term that means different things to different users; usually it relates either to accounting and corporate finance applications or to quantitative finance applications. While there has been some debate in the industry as to the nature of financial modeling (whether it is a tradecraft, such as welding, or a science), the task of financial modeling has been gaining acceptance and rigor over the years. [2] Typically, financial modeling is understood to mean an exercise in either asset pricing or corporate finance, of a quantitative nature. In other words, financial modeling is about translating a set of hypotheses about the behavior of markets or agents into numerical predictions; for example, a firm’s decisions about investments (the firm will invest 20% of assets), or investment returns [3] (returns on “stock A” will, on average, be 10% higher than the market’s returns). Continue reading “Financial modeling”
Environmental informatics
Environmental informatics is the science of information applied to environmental science. As such, it provides the information processing and communication infrastructure to the interdisciplinary field of environmental science, [1] encompassing data, information and knowledge integration, the application of computational intelligence to environmental data, and the identification of the environmental impacts of information technology. The UK Natural Environment Research Council defines environmental informatics as “research and system development focusing on the environmental sciences relating to the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information.” [2] Kostas Karatzas defined environmental informatics as the creation of a new knowledge paradigm towards serving environmental management needs, considering it an integrator of sciences, methods and techniques and not just the result of using information and software technology methods and tools for serving environmental engineering needs. [3] Continue reading “Environmental informatics”
Disease informatics
Disease informatics is the application of information science to defining diseases with the least error, identifying most of the targets for fighting clusters of diseases (the disease causal chain), and designing a holistic solution (health strategy) to the problem. [1] Continue reading “Disease informatics”
Data science
Data science, also known as data-driven science, is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, [1] [2] similar to data mining. Continue reading “Data science”
Computer simulation
Computer simulations reproduce the behavior of a system using a mathematical model. Computer simulations have become a useful tool for the mathematical modeling of many natural systems in physics (computational physics), astrophysics, climatology, chemistry and biology; human systems in economics, psychology, and social science; and engineering. Simulation of a system is represented as the running of the system’s model. It can be used to explore and gain new insights into new technology and to estimate the performance of systems too complex for analytical solutions. [1] Continue reading “Computer simulation”
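A worked miniature of “running a system’s model”: the sketch below integrates the decay model dN/dt = -kN with the explicit Euler method and compares the result with the exact solution. The constants are illustrative, not from any particular study:

```python
import math

k, dt = 0.5, 0.01          # decay rate (1/s) and time step (s), illustrative values
N, t = 1000.0, 0.0         # initial amount and time
while t < 5.0:
    N -= k * N * dt        # advance the model one Euler step
    t += dt
print(f"N(5) ≈ {N:.1f}; exact value {1000.0 * math.exp(-k * 5.0):.1f}")
```

Shrinking `dt` moves the simulated value closer to the analytical one, which is the basic trade-off of numerical simulation.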
Computational X
Computational X is a term used to describe the various fields of study that have emerged from the applications of informatics and big data to specific disciplines. Examples include computational biology , computational neuroscience , computational physics , and computational linguistics . Continue reading “Computational X”
Computational transportation science
Computational transportation science (CTS) is an emerging discipline that combines computer science and engineering with the modeling, planning, and economic aspects of transportation. The discipline studies how to improve the safety, mobility, and sustainability of the transportation system by taking advantage of information technologies and ubiquitous computing. A list of subjects encompassed by CTS can be found in [1]. Continue reading “Computational transportation science”
Computational topology
Algorithmic topology , or computational topology , is a subfield of topology with an overlap of areas of computer science , in particular, computational geometry and computational complexity theory . Continue reading “Computational topology”
Computational thinking
Computational thinking is the thought process involved in formulating a problem and expressing its solution(s) in such a way that a computer (human or machine) can effectively carry out. [1] Continue reading “Computational thinking”
Computational sustainability
Computational sustainability is a broad field that attempts to optimize societal, economic, and environmental resources using methods from mathematics and computer science. [1] Sustainability in this context is the ability to produce enough energy for the world to support its biological systems, using the power of computers to process large quantities of information. [2] Continue reading “Computational sustainability”
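As a hedged illustration of the optimization flavor of the field, the toy linear program below allocates two hypothetical energy sources to meet demand under an emissions cap. It assumes SciPy is available, and every number is made up:

```python
from scipy.optimize import linprog

# Minimize cost c·x subject to meeting demand and capping the fossil source.
c = [50, 20]                      # cost per MWh: clean (x1) vs. fossil (x2)
A_ub = [[-1, -1],                 # -(x1 + x2) <= -100  ->  meet 100 MWh demand
        [0, 1]]                   #   x2       <= 60    ->  emissions cap on fossil
b_ub = [-100, 60]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, res.fun)             # optimal mix [40, 60] and total cost 3200
```

Real computational-sustainability models have the same shape but vastly more variables and constraints.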
Computational Statistics & Data Analysis
Computational Statistics & Data Analysis is a monthly peer-reviewed scientific journal covering research and applications of computational statistics and data analysis. The journal was established in 1983 and is the official journal of the International Association for Statistical Computing, [1] a section of the International Statistical Institute. Continue reading “Computational Statistics & Data Analysis”
Computational statistics
Computational statistics , or statistical computing , is the interface between statistics and computer science . It is the area of computational science (or scientific computing) specific to the mathematical science of statistics . This area is also rapidly expanding to include a broader concept of computing as part of general statistical education . [1] Continue reading “Computational statistics”
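One staple technique at this interface is the bootstrap, which replaces analytic derivations with brute-force resampling. A minimal Python sketch of a percentile bootstrap confidence interval, on toy data:

```python
import random

def bootstrap_ci(data, stat, n_resamples=10_000, alpha=0.05):
    """Percentile bootstrap confidence interval for a statistic of the data."""
    stats = sorted(
        stat([random.choice(data) for _ in data]) for _ in range(n_resamples)
    )
    lo = stats[int(n_resamples * alpha / 2)]
    hi = stats[int(n_resamples * (1 - alpha / 2))]
    return lo, hi

sample = [2.1, 2.4, 1.9, 2.8, 2.2, 2.6, 2.0, 2.5]
print(bootstrap_ci(sample, lambda xs: sum(xs) / len(xs)))  # CI for the mean
```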
Computational social science
Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. This means that computers are used to model, simulate, and analyze social phenomena. Fields include computational economics, computational sociology, cliodynamics, culturomics, and the automated analysis of content in social and traditional media. It focuses on investigating social and behavioral relationships and interactions through social simulation, modeling, network analysis, and media analysis. [1] Continue reading “Computational social science”
Computational semiotics
Computational semiotics is an interdisciplinary field that applies, conducts, and draws on research in logic , mathematics , the theory and practice of computation , formal and natural language studies , the cognitive sciences , and semiotics proper. A common theme of this work is the adoption of a sign-theoretic perspective on issues of artificial intelligence and knowledge representation . Many of its applications lie in the field of human-computer interaction (HCI) and fundamental devices of recognition. Continue reading “Computational semiotics”
Computational semantics
Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions. It plays an important role in natural language processing and computational linguistics. Continue reading “Computational semantics”
Computational scientist
A computational scientist is a person skilled in scientific computing. This person is usually a scientist, an engineer, or an applied mathematician who uses high-performance computers in different ways to advance the state of the art in their respective applied disciplines: physics, chemistry, social sciences and so forth. Thus scientific computing has increasingly influenced many areas such as economics, biology, law and medicine, to name a few. Continue reading “Computational scientist”
Computational engineering
Computational science and engineering (CSE) is a relatively new discipline that deals with the development and application of computational models and simulations, often coupled with high-performance computing, to solve complex physical problems arising in engineering analysis and design (computational engineering) as well as natural phenomena (computational science). CSE has been described as the “third mode of discovery” (next to theory and experimentation).[1] In many fields, computer simulation is integral and therefore essential to business and research. Computer simulation provides the capability to enter fields that are either inaccessible to traditional experimentation or where carrying out traditional empirical inquiries is prohibitively expensive. CSE should neither be confused with pure computer science, nor with computer engineering, although a wide domain in the former is used in CSE (e.g., certain algorithms, data structures, parallel programming, high performance computing) and some problems in the latter can be modeled and solved with CSE methods (as an application area). Continue reading “Computational engineering”
Computational science
Computational science (also scientific computing or scientific computation ( SC )) is a rapidly growing multidisciplinary field that uses advanced computing capabilities to understand and solve complex problems. It is an area of science which spans many disciplines, but at its core it involves the development of models and simulations to understand natural systems. Continue reading “Computational science”
Computational physics
Computational physics is the study and implementation of numerical analysis to solve problems in physics for which a quantitative theory already exists. [1] Historically, computational physics was the first application of modern computers in science, and is now a subset of computational science . Continue reading “Computational physics”
Computational phylogenetics
Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing the evolutionary ancestry of a set of genes, species, or other taxa. For example, these techniques have been used to explore the family tree of hominid species [1] and the relationships between specific types of organisms. [2] Traditional phylogenetics relies on morphological data obtained by measuring and quantifying phenotypic properties, while the more recent field of molecular phylogenetics uses nucleotide sequences encoding genes or amino acid sequences encoding proteins as the basis for classification. Many forms of molecular phylogenetics are closely related to, and make extensive use of, sequence alignment in constructing and refining phylogenetic trees, which are used to classify the evolutionary relationships between homologous genes represented in the genomes of divergent species. The phylogenetic trees constructed by computational methods are unlikely to perfectly reproduce the evolutionary tree that represents the historical relationships between the species analyzed. The historical species tree may also differ from the historical tree of an individual homologous gene. Continue reading “Computational phylogenetics”
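As a sketch of one early computational step, the snippet below builds a pairwise p-distance matrix from toy aligned sequences (the sequences are hypothetical); matrices like this feed distance-based tree-building methods such as neighbor joining:

```python
# Hypothetical pre-aligned toy sequences.
seqs = {
    "human": "ACGTACGT",
    "chimp": "ACGTACGA",
    "mouse": "ACGAACTA",
}

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

names = list(seqs)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, p_distance(seqs[a], seqs[b]))
```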
Computational photography
Computational photography or computational imaging refers to digital image capture and processing techniques that use digital computation instead of optical processes. Computational photography can improve the capabilities of a camera, or introduce features that are not possible at all with film-based photography. Examples of computational photography include in-camera computation of digital panoramas, [6] high-dynamic-range images, and light field cameras. Light field cameras capture three-dimensional scene information that can be used to produce 3D images, enhanced depth-of-field, and selective de-focusing (or “post focus”). Enhanced depth-of-field reduces the need for mechanical focusing systems. All of these features use computational imaging techniques. Continue reading “Computational photography”
Computational particle physics
Computational particle physics refers to the methods and computing tools developed by and used in particle physics research. Like computational chemistry or computational biology, it is, for particle physics, both a specific branch and an interdisciplinary field relying on computer science, theoretical and experimental particle physics, and mathematics. The main fields of computational particle physics are: lattice field theory (numerical computations), automatic calculation of particle interactions or decays (computer algebra), and event generators (stochastic methods). Continue reading “Computational particle physics”
Computational number theory
In mathematics and computer science, computational number theory, also known as algorithmic number theory, is the study of algorithms for performing number-theoretic computations. Continue reading “Computational number theory”
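A representative algorithm from the field is the Fermat primality test, built on fast modular exponentiation. A minimal Python sketch (note that the test is probabilistic, and composites called Carmichael numbers can fool it; production code uses the stronger Miller-Rabin test):

```python
import random

def fermat_probable_prime(n, trials=20):
    """Fermat test: False means definitely composite, True means probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:   # built-in fast modular exponentiation
            return False
    return True

print(fermat_probable_prime(2**61 - 1))  # a known Mersenne prime -> True
```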
Computational neuroscience
Computational neuroscience (also theoretical neuroscience) studies brain function in terms of the information processing properties of the structures that make up the nervous system. [1] [2] It is an interdisciplinary computational science that links the diverse fields of neuroscience, cognitive science, and psychology with electrical engineering, computer science, mathematics, and physics. Continue reading “Computational neuroscience”
Computational neurogenetic modeling
Computational neurogenetic modeling (CNGM) is concerned with the study and development of dynamic neuronal models for modeling brain functions with respect to genes and dynamic interactions between genes. These include neural network models and their integration with gene network models. This area brings together knowledge from various scientific disciplines, such as computer and information science, neuroscience and cognitive science, genetics and molecular biology, as well as engineering. Continue reading “Computational neurogenetic modeling”
Computational musicology
Computational musicology is defined as the study of music by means of computational modeling and simulation. [1] The field started in the 1950s and originally relied on statistical and mathematical methods rather than computers. Nowadays computational musicology depends largely on complex algorithms. Related fields include computer science, computer music, systematic musicology, music information retrieval, digital musicology, sound and music computing, and music informatics. [2] Continue reading “Computational musicology”
Computational mechanics
Computational Mechanics is the discipline concerned with the use of computational methods to study phenomena governed by the principles of mechanics . Before the emergence of computational science (also called scientific computing) as a “third way” besides theoretical and experimental sciences, computational mechanics was widely considered to be a sub-discipline of applied mechanics . It is now considered to be a sub-discipline within computational science. Continue reading “Computational mechanics”
Computational Materials Science
Computational Materials Science is a monthly peer-reviewed scientific journal published by Elsevier . It was established in October 1992. The editors-in-chief are H. Dreysse and S. Schmauder. The journal covers computational modeling and practical research for advanced materials and their applications. [2] Continue reading “Computational Materials Science”
Computational magnetohydrodynamics
Computational magnetohydrodynamics (CMHD) is a rapidly developing branch of magnetohydrodynamics that uses numerical methods and algorithms to solve problems involving electrically conducting fluids. Most of the methods used in CMHD are also used in computational fluid dynamics. The added complexity arises from the presence of a magnetic field and its coupling with the fluid. One of the important issues is to numerically maintain the ∇·B = 0 (conservation of magnetic flux) condition, from Maxwell’s equations, to avoid any unphysical effects. Continue reading “Computational magnetohydrodynamics”
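To illustrate the divergence constraint numerically, the sketch below evaluates ∇·B on a discrete grid with NumPy for an illustrative solenoidal field; a CMHD code monitors exactly this kind of quantity at every time step:

```python
import numpy as np

# B = (-y, x) is divergence-free by construction; the discrete divergence
# should therefore stay at zero up to floating-point error.
x, y = np.meshgrid(np.linspace(-1, 1, 50), np.linspace(-1, 1, 50), indexing="ij")
Bx, By = -y, x
div_B = np.gradient(Bx, axis=0) + np.gradient(By, axis=1)
print(np.abs(div_B).max())   # ~0; a growing value would signal unphysical flux
```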
Computational logic
Computational logic is the use of logic to perform or reason about computation. It bears a similar relationship to computer science and engineering as mathematical logic bears to mathematics and as philosophical logic bears to philosophy. It is synonymous with “logic in computer science”. Continue reading “Computational logic”
Computational lithography
Computational lithography (also known as computational scaling) is the set of mathematical and algorithmic approaches designed to improve the resolution achievable through photolithography. Computational lithography came to the forefront of photolithography in 2008 as the semiconductor industry grappled with the challenges associated with the transition to 22 nanometer CMOS process technology and beyond. Continue reading “Computational lithography”
Computational linguistics
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective, as well as the study of appropriate computational approaches to linguistic issues. Continue reading “Computational linguistics”
Computational lexicology
Computational lexicology is a branch of computational linguistics, which is concerned with the use of computers in the study of the lexicon. It has been more narrowly described by some scholars (Amsler, 1980) as the use of computers in the study of machine-readable dictionaries. It is distinguished from computational lexicography, which more properly would be the use of computers in the construction of dictionaries, though some researchers have used computational lexicography as a synonym. Continue reading “Computational lexicology”
Computational learning theory
In computer science, computational learning theory (or just learning theory) is a subfield of artificial intelligence devoted to studying the design and analysis of machine learning algorithms. [1] Continue reading “Computational learning theory”
Computational law
Computational law is a branch of legal informatics concerned with the mechanization of legal reasoning (whether done by humans or by computers). [1] It emphasizes explicit behavioral constraints and eschews implicit rules of conduct. Importantly, there is a commitment to a level of rigor in specifying laws that is sufficient to support entirely mechanical processing. Continue reading “Computational law”
Computational journalism
Computational journalism can be defined as the application of computation to the gathering, organization, sensemaking, communication and dissemination of news information, while upholding journalistic values such as accuracy and verifiability. [1] The field draws on technical aspects of computer science including artificial intelligence, content analysis (NLP, vision, hearing), visualization, personalization and recommender systems, as well as aspects of social computing and information science. Continue reading “Computational journalism”
Computational immunology
In academia, computational immunology is a field of science that encompasses high-throughput genomic and bioinformatics approaches to immunology. The field’s main aim is to convert immunological data into computational problems, solve these problems using mathematical and computational approaches, and then convert the results into immunologically meaningful interpretations. Continue reading “Computational immunology”
Computational humor
Computational humor is a branch of computational linguistics and artificial intelligence which uses computers in humor research . It is a relatively new area, with the first dedicated conference organized in 1996. [1] Continue reading “Computational humor”
Computational group theory
In mathematics, computational group theory is the study of groups by means of computers. It is concerned with designing and analyzing algorithms and data structures to compute information about groups. The subject has attracted interest because for many interesting groups (including most of the sporadic groups) it is impractical to perform calculations by hand. Continue reading “Computational group theory”
Computational geophysics
Computational geophysics entails the rapid numerical computations that support the analysis of geophysical data and observations. High-performance computing is involved, due to the size and complexity of the geophysical data to be processed. The main computing requirements are 3D and 4D imaging of the sub-surface earth, modeling and migration in complex media, tomography, and inverse problems. Continue reading “Computational geophysics”
Computational geometry
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry . Some purely geometrical problems arise from the study of computational geometric algorithms , and such problems are also considered to be part of computational geometry. While modern computational geometry is a recent development, it is one of the oldest fields of computation with history stretching back to antiquity. Continue reading “Computational geometry”
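A classic example of such an algorithm is convex hull computation. Below is a short Python sketch of Andrew’s monotone chain method, built on the orientation (cross-product) predicate that underlies much of computational geometry:

```python
def cross(o, a, b):
    """Orientation predicate: positive if o->a->b turns counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for seq, out in ((pts, lower), (reversed(pts), upper)):
        for p in seq:
            # Pop points that would make a clockwise (non-convex) turn.
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
    return lower[:-1] + upper[:-1]

print(convex_hull([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2), (1, 3)]))
```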
Computational genomics
Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, [1] including both DNA and RNA sequences as well as other “post-genomic” data (i.e., experimental data obtained with technologies, such as DNA microarrays, that require the genome sequence). These fields are also often referred to as computational and statistical genetics/genomics. As such, computational genomics can be considered a subset of bioinformatics and computational biology, but with a focus on using whole genomes (rather than individual genes) to understand how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means to biological discovery. [2] Continue reading “Computational genomics”
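As a small, concrete example of a computational-genomics primitive, the sketch below counts k-mers (overlapping substrings of length k) in a hypothetical toy sequence; k-mer statistics underpin genome assembly and alignment-free sequence comparison:

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count all overlapping k-mers in a DNA sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Toy sequence for illustration; real analyses stream whole genomes.
print(kmer_counts("ATGCGATGACCTGACT", 3).most_common(3))
```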
Computational finance
Computational finance is a branch of applied computer science that deals with problems of practical interest in finance. [1] Somewhat different definitions are the study of data and algorithms currently used in finance [2] and the mathematics of computer programs that realize financial models or systems. [3] Continue reading “Computational finance”
Computational epistemology
Computational epistemology is a subdiscipline of formal epistemology that studies the intrinsic complexity of inductive problems for ideal and computationally bounded agents. In short, computational epistemology is to induction what recursion theory is to deduction. Continue reading “Computational epistemology”
Computational epigenetics
Computational epigenetics [1] [2] uses bioinformatic methods to complement experimental research in epigenetics. Due to the recent explosion of epigenome datasets, computational methods play an increasing role in all areas of epigenetic research. Continue reading “Computational epigenetics”
Computational economics
Computational economics is a research discipline at the interface of computer science, economics, and management science.[1] This subject encompasses computational modeling of economic systems, whether agent-based,[2] general-equilibrium,[3] macroeconomic,[4] or rational-expectations,[5] computational econometrics and statistics,[6] computational finance, computational tools for the design of automated internet markets, programming tools specifically designed for computational economics, and pedagogical tools for the teaching of computational economics. Some of these areas are unique to computational economics, while others extend traditional areas of economics by solving problems that are difficult to study without the use of computers and associated numerical methods.[7] Continue reading “Computational economics”
Computational criminology
Computational criminology is an interdisciplinary field that uses computer science methods to formally define criminology concepts, improve our understanding of complex phenomena, and generate solutions for related problems. Continue reading “Computational criminology”
Computational creativity
Computational creativity (also known as artificial creativity, mechanical creativity, creative computation or creative computing) is a multidisciplinary endeavour that is located at the intersection of the fields of artificial intelligence, cognitive psychology, philosophy, and the arts. Continue reading “Computational creativity”
Computational complexity theory
Computational complexity theory is a branch of the theory of computation in theoretical computer science that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. A computational problem is understood to be a task which is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm. Continue reading “Computational complexity theory”
Computational cognition
Computational cognition (sometimes referred to as computational cognitive science) is the study of the computational basis of learning and inference by mathematical modeling, computer simulation, and behavioral experiments. In psychology, it is an approach which develops computational models based on experimental results. It seeks to understand the basis behind the human method of processing information. Early on, computational cognitive scientists sought to bring back and create a scientific form of Brentano’s psychology. [1] Continue reading “Computational cognition”
Computational chemistry
Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into efficient computer programs, to calculate the structures and properties of molecules and solids. It is necessary because, apart from relatively recent results concerning the hydrogen molecular ion (the dihydrogen cation; see references therein for more details), the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, they can in some cases predict hitherto unobserved chemical phenomena. Computational chemistry is widely used in the design of new drugs and materials. Continue reading “Computational chemistry”
Computational biology
Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. [1] The field is broadly defined and includes foundations in computer science, applied mathematics, animation, statistics, biochemistry, chemistry, biophysics, molecular biology, genetics, genomics, ecology, evolution, anatomy, neuroscience, and visualization. [2] Continue reading “Computational biology”
Computational auditory scene analysis
Computational auditory scene analysis (CASA) is the study of auditory scene analysis by computational means. [1] In essence, CASA systems are “machine listening” systems that aim to separate mixtures of sound into their individual sources. CASA differs from the field of blind signal separation in that it is (at least to some extent) based on the mechanisms of the human auditory system, and thus uses no more than two microphone recordings of an acoustic environment. It is related to the cocktail party problem.
Computational astrophysics
Computational astrophysics refers to the methods and computing tools developed and used in astrophysics research. Like computational chemistry or computational physics, it is a specific branch of theoretical astrophysics and an interdisciplinary field relying on computer science, mathematics, and wider physics. Computational astrophysics is most often studied through an applied mathematics or astrophysics program at PhD level. Continue reading “Computational astrophysics”
Computational archeology
Computational archeology describes computer-based analytical methods for the study of long-term human behavior and behavioral evolution. As with other sub-disciplines that have prefixed ‘computational’ to their name (e.g., computational biology, computational physics and computational sociology), the term is reserved for (largely mathematical) methods that could not realistically be performed without the aid of a computer. Continue reading “Computational archeology”
Computational aeroacoustics
Computational aeroacoustics is a branch of aeroacoustics that aims to analyze the generation of noise by turbulent flows through numerical methods. Continue reading “Computational aeroacoustics”
Computable topology
Computable topology is a discipline in mathematics that studies the topological and algebraic structure of computation. Computable topology is not to be confused with computational topology, which studies the application of computation to topology. Continue reading “Computable topology”
Community informatics
Community informatics (CI) is an interdisciplinary field that is concerned with using information and communication technology (ICT) to empower members of communities and support their social, cultural, and economic development. [1] [2] Community informatics may contribute to enhancing democracy, supporting the development of social capital, and building well-connected communities; moreover, such efforts may foster new forms of positive social change. [2] In community informatics, there are several considerations: the social context, shared values, the distinct processes taken by members of a community, and social and technical systems. [2] It is formally located as an academic discipline within a variety of academic faculties including information science, information systems, computer science, planning, development studies, and library science among others, and draws on insights on community development from a range of background disciplines. It is an interdisciplinary approach interested in using ICTs for different forms of community action, as distinct from ICT effects. [1] [3] Continue reading “Community informatics”
Cheminformatics
Cheminformatics (also known as chemoinformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used, for example, in pharmaceutical companies in the process of drug discovery. The methods can also be used in chemical and allied industries in various other forms. Continue reading “Cheminformatics”
Biological computation
The term “biological computation” refers, variously, to any of the following:
– the study of the computations performed by natural biota , [1] [2] [3] [4] including the subject matter of systems biology . Continue reading “Biological computation”
Biodiversity informatics
Biodiversity informatics is the application of informatics techniques to biodiversity information for improved management, presentation, discovery, exploration and analysis. It typically builds on a foundation of taxonomic, biogeographic, or ecological information stored in digital form, which, with the application of modern computer techniques, can yield new ways to view and analyze existing information, as well as predictive models for information that does not yet exist (see niche modeling). Biodiversity informatics is a relatively young discipline (the term was coined in or around 1992) but has hundreds of practitioners worldwide, including the numerous individuals involved with the design and building of taxonomic databases. The term “biodiversity informatics” is generally used in the broad sense; the term “bioinformatics” is often used synonymously with the computerized handling of data in the specialized area of molecular biology.
Author profiling
Author profiling is a method of analyzing a collection of texts to determine characteristics of their author (e.g., age and gender) based on stylistic and content-based features. Continue reading “Author profiling”
Astroinformatics
Astroinformatics is an interdisciplinary field of study involving the combination of astronomy , data science , informatics , and information / communications technologies. [1] [2] Continue reading “Astroinformatics”
Algorithmic art
Algorithmic art, also known as algorithm art, is art, mostly visual art, in which the design is generated by an algorithm. Algorithmic artists are sometimes called algorists. Continue reading “Algorithmic art”
Agent-based computational economics
Agent-based computational economics (ACE) is the area of computational economics that studies economic processes, including whole economies, as dynamic systems of interacting agents. As such, it falls in the paradigm of complex adaptive systems. [1] In corresponding agent-based models, the “agents” are “computational objects modeled as interacting according to rules” over space and time, not real people. The rules are formulated to model behavior and social interactions based on incentives and information. [2] Such rules could also be the result of optimization, realized through the use of AI methods (such as Q-learning and other reinforcement learning techniques). [3] Continue reading “Agent-based computational economics”
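The following Python sketch conveys the flavor of an agent-based model: agents repeatedly meet pairwise and exchange wealth under a simple rule, and inequality emerges purely from the interaction dynamics. It is an illustrative toy, not any published ACE model:

```python
import random

agents = [100.0] * 50                      # identical initial endowments
for _ in range(10_000):
    i, j = random.sample(range(len(agents)), 2)   # two agents meet
    transfer = random.random() * min(agents[i], agents[j]) * 0.1
    agents[i] -= transfer                  # a random share changes hands
    agents[j] += transfer
agents.sort()
print(f"poorest: {agents[0]:.1f}, richest: {agents[-1]:.1f}")  # emergent inequality
```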
Xgboost
XGBoost [1] is an open-source software library that provides a gradient boosting framework for C++, Java, Python, [2] R, [3] and Julia. [4] It works on Linux, Windows, [5] and macOS. [6] From the project description, it aims to provide a “Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT) Library”. Other than running on a single machine, it also supports the distributed processing frameworks Apache Hadoop, Apache Spark, and Apache Flink. It has gained much popularity and attention recently as the algorithm of choice for many winning teams of machine learning competitions. [7] Continue reading “Xgboost”
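A minimal usage sketch with the XGBoost Python package, training on synthetic data; the parameters are illustrative, not recommendations:

```python
import numpy as np
import xgboost as xgb  # assumes the xgboost package is installed

# Synthetic binary classification problem.
X = np.random.rand(200, 4)
y = (X[:, 0] + X[:, 1] > 1).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "binary:logistic", "max_depth": 3, "eta": 0.3}
booster = xgb.train(params, dtrain, num_boost_round=20)
print(booster.predict(xgb.DMatrix(X))[:5])  # predicted probabilities
```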
Virtuoso Universal Server
Virtuoso Universal Server is a middleware and database engine that combines the functionality of a traditional relational database management system (RDBMS), object-relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server in a single system. Virtuoso is a “universal server”; it allows a single multithreaded server process to implement multiple protocols. The open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software, with Kingsley Uyi Idehen and Orri Erling as the chief software architects. Continue reading “Virtuoso Universal Server”
SQream DB
SQream DB is a relational database management system (RDBMS) that uses graphics processing units (GPUs) from Nvidia . SQream DB is designed for big data analytics using the Structured Query Language (SQL). [2] Continue reading “SQream DB”
Apache Spark
Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley’s AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Continue reading “Apache Spark”
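A minimal PySpark sketch of the canonical word-count job, assuming a local Spark installation:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount").getOrCreate()
lines = spark.sparkContext.parallelize(["to be or not to be", "to be is to do"])
counts = (lines.flatMap(str.split)              # split lines into words
               .map(lambda w: (w, 1))           # pair each word with 1
               .reduceByKey(lambda a, b: a + b))  # sum counts per word
print(counts.collect())
spark.stop()
```

The same code runs unchanged on a multi-node cluster; Spark handles the data parallelism implicitly.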
SAP HANA
SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. [1] [2] Its primary function as a database server is to store and retrieve data as requested by the applications. In addition, it performs advanced analytics (predictive analytics, spatial data processing, text analytics, text search, streaming analytics, graph data processing) and includes ETL capabilities as well as an application server. Continue reading “SAP HANA”
Qizx
Qizx is a proprietary XML database that provides native storage for XML data.
Qizx was first developed by Xavier Franc of Axyana [1] and was purchased by Qualcomm in 2013. [2] Qizx was re-released by Qualcomm in late 2014 on Amazon Web Services . [3] Continue reading “Qizx”
Oracle NoSQL Database
Oracle NoSQL Database is a NoSQL -type distributed key-value database from Oracle Corporation . [1] [2] [3] [4] It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring. Continue reading “Oracle NoSQL Database”
Oracle Big Data Appliance
The Oracle Big Data Appliance consists of hardware and software from Oracle Corporation sold as a computer appliance. It was announced in 2011, promoted for consolidating and loading unstructured data into Oracle Database software. Continue reading “Oracle Big Data Appliance”
NoSQLz
NoSQLz is a consistent key-value data store (NoSQL database) for IBM z/OS systems. [1] It was developed by Thierry Falissard in 2013. The purpose is to provide a low-cost alternative to proprietary mainframe DBMSs (version 1 is free software). Continue reading “NoSQLz”
MonetDB
MonetDB is an open source column-oriented database management system developed at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. It was designed to provide high performance on complex queries against large databases, such as combining tables with hundreds of columns and millions of rows. MonetDB has been applied in high-performance applications for online analytical processing, data mining, geographic information systems (GIS), [1] Resource Description Framework (RDF), [2] text retrieval and sequence alignment processing. [3] Continue reading “MonetDB”
Predix (software)
Predix is General Electric’s software platform for the collection and analysis of data from industrial machines. [1] General Electric plans to support the growing industrial Internet of things with cloud servers and an app store . [2] GE is a member of the Industrial Internet Consortium, which works with the development and use of industrial internet technologies. [3] Continue reading “Predix (software)”
MindSphere
MindSphere is an open cloud platform or “IoT operating system” [1] developed by Siemens for applications in the context of the Internet of Things (IoT). [2] MindSphere stores operational data and makes it accessible through digital applications (“MindApps”) to enable industrial customers to make decisions based on valuable factual information. [3] The system is used in such applications as automated production and vehicle fleet management. [2] [4] Continue reading “MindSphere”
Hue (Hadoop)
Hue (Hadoop User Experience) is an open-source Web interface that supports Apache Hadoop and its ecosystem, licensed under the Apache v2 license. [1] Continue reading “Hue (Hadoop)”
Hibari (database)
Hibari is a highly consistent, highly available, distributed key-value big data store (NoSQL database). [1] It was developed by Cloudian, Inc., formerly Gemini Mobile Technologies, to support its mobile messaging and email services, and was released as open source on July 27, 2010. Continue reading “Hibari (database)”
Apache Hadoop
Apache Hadoop (/həˈduːp/) is an open source software framework used for distributed storage and processing of big data sets using the MapReduce programming model. It consists of computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be handled by the framework. [2] Continue reading “Apache Hadoop”
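Hadoop itself is written in Java, but Hadoop Streaming lets any executable act as mapper and reducer. The single-file Python sketch below implements word count in that style; the invocation details in the comment are illustrative:

```python
#!/usr/bin/env python3
# Word count for Hadoop Streaming. Hadoop pipes input lines through the
# mapper, shuffles/sorts by key, then feeds grouped lines to the reducer.
# Illustrative invocation: pass "map" or "reduce" as the first argument.
import sys

def mapper():
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")          # emit (word, 1) pairs

def reducer():
    current, total = None, 0
    for line in sys.stdin:               # input arrives sorted by key
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```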
H2O (software)
H2O is open source software for big-data analysis . It is produced by the company H2O.ai (formerly 0xdata ), which launched in 2011 in Silicon Valley . H2O allows users to make thousands of potential models as part of discovering patterns in data. Continue reading “H2O (software)”
Apache Cassandra
Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, [1] with asynchronous masterless replication allowing low-latency operations for all clients. Continue reading “Apache Cassandra”
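A minimal usage sketch with the DataStax Python driver against a local node; the keyspace and table names are hypothetical:

```python
from cassandra.cluster import Cluster  # assumes the cassandra-driver package

cluster = Cluster(["127.0.0.1"])       # connect to a local Cassandra node
session = cluster.connect()
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text)")
session.execute("INSERT INTO demo.users (id, name) VALUES (%s, %s)", (1, "Ada"))
for row in session.execute("SELECT id, name FROM demo.users"):
    print(row.id, row.name)
cluster.shutdown()
```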
Apache SystemML
Apache SystemML is a flexible machine learning system that automatically scales to Spark and Hadoop clusters. SystemML’s distinguishing characteristics are:
- Algorithm customizability via R-like and Python-like languages.
- Multiple execution modes, including Standalone, Spark Batch, Spark MLContext, Hadoop Batch, and JMLC.
- Automatic optimization based on data and cluster characteristics to ensure both efficiency and scalability.
Apache Mahout
Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily on the areas of collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. [2] [3] Mahout also provides Java libraries for common math operations and Java primitive collections. Mahout is a work in progress; the number of implemented algorithms has grown quickly, [4] but various algorithms are still missing. Continue reading “Apache Mahout”
Apache Beam
Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream (continuous) processing. [1] Beam pipelines are defined using one of the provided SDKs and executed in one of Beam’s supported runners (distributed processing back-ends), including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. [2] Continue reading “Apache Beam”
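A minimal pipeline sketch with the Apache Beam Python SDK, running on the default local DirectRunner; the same pipeline can target Flink, Spark, or Dataflow by switching runners:

```python
import apache_beam as beam  # assumes the apache-beam package is installed

with beam.Pipeline() as p:          # DirectRunner by default
    (p
     | beam.Create(["to be or not to be"])   # a tiny in-memory source
     | beam.FlatMap(str.split)               # split into words
     | beam.combiners.Count.PerElement()     # count occurrences per word
     | beam.Map(print))
```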
Smart, connected products
Smart, connected products are products, assets and other things embedded with processors, sensors, software and connectivity that allow data to be exchanged between the product and its environment, manufacturer, operator/user, and other products and systems. Connectivity also enables some capabilities of the product to exist outside the physical device, in what is known as the product cloud. The data collected from these products can be analyzed to inform decision-making, enable operational efficiencies and continuously improve the performance of the product. Continue reading “Smart, connected products”
Flutura Decision Sciences and Analytics
Flutura Decision Sciences and Analytics is an industrial Internet of things (IoT) company that focuses on machine to machine and big data analytics, serving customers from the manufacturing, energy and engineering industries. Its main offices are located in Palo Alto, California, and its development center is in Bengaluru, India. Continue reading “Flutura Decision Sciences and Analytics”
Carriots
Carriots is an application hosting and development platform (Platform as a Service) specially designed for projects related to the Internet of Things (IoT) and machine to machine (M2M) communication. It enables collecting data from connected objects (the “things” part), storing it, building powerful applications with few lines of code, and integrating with IT systems (the “internet” part). Carriots provides a development environment, APIs and hosting for IoT project development. Continue reading “Carriots”
Machine to machine
Machine to machine refers to direct communication between devices using any communications channel, including wired and wireless. [1] [2] Machine to machine communication can include industrial instrumentation, enabling a sensor or meter to communicate the data it records (such as temperature, inventory level, etc.) to application software that can use it (for example, adjusting an industrial process based on temperature or placing orders to replenish inventory). [3] Such communication was originally accomplished by having a remote network of machines relay information back to a central hub for analysis, which would then be rerouted into a system like a personal computer. [4] Continue reading “Machine to machine”
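As a hedged sketch of the sensor-to-application pattern, the snippet below publishes a reading over MQTT, one protocol widely used for M2M; it assumes the paho-mqtt 1.x package, and the broker address and topic are hypothetical:

```python
import paho.mqtt.client as mqtt  # assumes the paho-mqtt package (1.x API)

client = mqtt.Client()
client.connect("broker.example.com", 1883)        # hypothetical broker
# A "sensor" publishes a reading that any subscribed application can consume.
client.publish("plant/boiler/temperature", "72.5")  # hypothetical topic/value
client.disconnect()
```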
ZoomData
Zoomdata is a Reston , Virginia -based data visualization and analytics company founded in 2012. [1] Continue reading “ZoomData”
Zaloni
Zaloni, Inc. is a privately owned software and services company headquartered in Durham, North Carolina. Zaloni provides data management software and solutions for big data scale-out architectures such as Apache Hadoop and Amazon S3. The company focuses on management of data lakes with two products: Bedrock and Mica. [2] Continue reading “Zaloni”
VoloMetrix
VoloMetrix, Inc. is an American subsidiary of Microsoft based in Seattle, Washington. VoloMetrix sells people analytics software that combines data from collaboration platforms to create data visualizations and dashboards. At the end of April 2013, the company raised $3.3M in Series A funding from Shasta Ventures. [2] In October 2014, VoloMetrix announced a round of funding with Shasta Ventures and Split Rock Partners that raised $12M. [3] In September 2015, Microsoft announced that they had acquired the company, but did not disclose the amount. The acquisition was made to improve existing Microsoft offerings, Microsoft Office 365 and Microsoft Delve. [4] Continue reading “VoloMetrix”
TubeMogul
TubeMogul is an enterprise software company for brand advertising. [3]
TubeMogul is headquartered in Emeryville, California and has global offices located in Chengdu (China), Chicago, Detroit, Kiev, New York, London, Los Angeles, Minneapolis, Paris, Sao Paulo, Singapore, Shanghai, Sydney, Toronto, and Tokyo. [2] Continue reading “TubeMogul”
ThetaRay
ThetaRay is a cyber security and big data analytics company headquartered in Hod HaSharon, Israel, with offices in New York and Singapore. The company provides a platform for the detection of unknown threats and risks to protect critical infrastructure [1] and financial services. The platform is also used to uncover unknown opportunities based on big data. [2] The company uses patented mathematical algorithms developed by the company founders. [3] Continue reading “ThetaRay”
Teradata
Teradata Corporation is a provider of database-related products and services. The company was formed in 1979 in Brentwood, California, as a collaboration between researchers at Caltech and Citibank’s advanced technology group. [2] The company was acquired by NCR Corporation in 1991, and subsequently spun off as an independent public company on October 1, 2007. Continue reading “Teradata”
Talend
Talend ( Pronunciation: TAL-end ) is a software integration vendor. The company provides big data , cloud storage , data integration , data management , master data management , data quality , data preparation and enterprise application integration software and services. [1] The company is headquartered in Redwood City, California . [2] Continue reading “Talend”
Sumo Logic
Sumo Logic is a cloud-based log management and analytics service that leverages machine-generated data to deliver real-time IT insights. [1] Headquartered in Redwood City, California, Sumo Logic was founded in April 2010 by ArcSight veterans Kumar Saurabh and Christian Beedgen, and is backed by Accel Partners, DFJ Growth, Greylock Partners, Institutional Venture Partners, Sequoia Capital, Sutter Hill Ventures and angel investor Shlomo Kramer. [2] While Sumo Logic remained in stealth mode for two years, it unveiled its cloud-based log management platform with Series B funding of $15 million in January 2012. [1] The round of Series E funding announced in June 2015 brought the company’s total venture capital backing to $160.5 million. [3] On June 27 the company closed its Series F round for $75 million and is on a path to IPO. [4] As of June 2017, the company has collected VC funding totaling $230 million. Continue reading “Sumo Logic”
Sojern
Sojern is a provider of data-driven traveler marketing that uses programmatic buying and machine learning technology. [1] [2] Sojern partners with travel sites, such as airlines and OTAs, to collect anonymized (non-personally identifiable) data on travelers based on their activity on these sites. [2] [3] The company uses this data to target travelers and deliver advertising across a number of media channels. [1] [3] Sojern is currently headquartered in San Francisco, with key offices in New York, Omaha, Dubai, Singapore, London and Dublin. [1] [4] Continue reading “Sojern”
Sense Networks
Sense Networks is a New York City based company with a focus on applications that analyze data from mobile phones, carrier networks, and taxicabs, particularly by using machine learning technology to make sense of large amounts of location (latitude/longitude) data. [1] [2] [3] [4] Continue reading “Sense Networks”
Semantic Research
Semantic Research, Inc. is a privately held software company headquartered in San Diego, California with flagship offices in Washington, DC and Tampa, FL . Semantic Research (not to be confused with Symantec ), is a California C-corporation that offers patented, graph-based knowledge discovery, analysis and visualization software technology. [1] [2] Its most popular product is a link analysis software application called SEMANTICA Pro. Continue reading “Semantic Research”
SalesforceIQ
SalesforceIQ (formerly RelateIQ), a subsidiary of Salesforce.com, is an American enterprise software company based in Palo Alto, California. The company’s software is a relationship intelligence platform that combines data from email systems, smartphone calls, and appointments to augment or replace standard relationship management tools or database solutions. It scans “about 10,000 emails, calendar entries, and other data points per minute at first run”. [1] Continue reading “SalesforceIQ”
Rocket U2
Rocket U2 is a suite of database management (DBMS) and supporting software now owned by Rocket Software. It includes two MultiValue database platforms: UniData and UniVerse. [1] Both of these products are operating environments which currently run on Unix, Linux and Windows operating systems. [2] [3] They are both derivatives of the Pick operating system. [4] The family also includes developer and web-enabling technologies including SystemBuilder/SB+, SB/XA, U2 Web Development Environment (WebDE), UniObjects and wIntegrate. [1] Continue reading “Rocket U2”
Rocket Fuel Inc.
Rocket Fuel is an ad technology company based in Redwood City , California. [3] It was founded in 2008 by alumni of Yahoo! . [3] Continue reading “Rocket Fuel Inc.”
Quid Inc.
Quid, Inc. is a private software and services company, specializing in text-based data analysis. Quid software can read millions of documents (e.g., news articles, blog posts, company profiles, and patents) and offers insight by organizing that content visually. [2] Continue reading “Quid Inc.”
Quertle
Quertle is a biomedical and life science big data analytics company specializing in knowledge discovery and literature searching. Continue reading “Quertle”
Qloo
Qloo (pronounced “clue”) is a company that uses artificial intelligence (AI); its application programming interface (API) provides cultural correlations. [1] It was founded by Alex Elias and received funding from Leonardo DiCaprio, Barry Sternlicht and Pierre Lagrange. Continue reading “Qloo”
Premise (company)
Premise is an American data company that tracks alternative economic indicators, such as local produce prices, and provides aggregated insights on consumption and inflation to governments and financial institutions. [1] [2] [3] [4] [5] Co-founders David Soloff and Joe Reisinger previously came from MetaMarkets, an online advertising analytics company co-founded by Soloff. [6] Continue reading “Premise (company)”
Platfora
Platfora, Inc. is a big data analytics company based in San Mateo, California . The firm’s software works with the open-source Apache Hadoop framework to assist with data analysis, data visualization , and sharing. [2] [3] [4] Continue reading “Platfora”
Palantir Technologies
Palantir Technologies is a private American software and services company which specializes in big data analysis . Headquartered in Palo Alto, California , the company is known for two products: Palantir Gotham and Palantir Metropolis. Palantir Gotham is used by counter-terrorism analysts at offices in the United States Intelligence Community (USIC) and United States Department of Defense , fraud investigators at the Recovery Accountability and Transparency Board , and cyber analysts at Information Warfare Monitor, while Palantir Metropolis is used by hedge funds, banks, and financial services firms. [3] [4] Continue reading “Palantir Technologies”
Ninja Metrics
Ninja Metrics, Inc. is a social analytics and data company based in Manhattan Beach, California . Its primary service measures social influence and provides predictive analytics for web and mobile applications . Continue reading “Ninja Metrics”
Medopad
Medopad Ltd is a British healthcare technology company based in London, UK, with additional offices in Singapore and Munich. It produces applications that integrate data from existing hospital databases and mobile devices and securely transmit it for use by doctors. [1] [2] Continue reading “Medopad”
Medio
Medio is a business-to-business mobile analytics provider based in Seattle , WA. The company processes pre-existing data [2] to provide historic and predictive analytics . Medio is built on a cloud-based [3] Hadoop platform and is designed to interpret big data for the mobile enterprise. Medio has had various partners including IBM , Rovio , [4] Verizon , T-Mobile , [5] ABC , and Disney . [6] Continue reading “Medio”
User: Maxhercask / sandbox
Cask Data , dba ‘Cask’, is a privately held information technology company, established in 2011, with its headquarters located in Palo Alto, California . It provides software and services that enable broad, data-intensive enterprises – such as Thomson Reuters [1] – and many other diverse clients to accelerate their ability to extract value from their big data investments. Continue reading “User: Maxhercask / sandbox”
MarkLogic
MarkLogic Corporation is an American software business that develops and provides an enterprise NoSQL database, also named MarkLogic . The company was founded in 2001 and is based in San Carlos , California . MarkLogic is privately held with over 500 employees and has offices throughout the United States , Europe , Asia , and Australia . Continue reading “MarkLogic”
MapR
MapR Technologies, Inc. is an enterprise software company headquartered in Santa Clara, California . MapR provides access to a wide variety of data sources from a single cluster, including big data workloads such as Apache Hadoop and Apache Spark , a distributed file system, a multi-model database management system , and event streaming. Combining real-time analytics with operational applications, its technology runs on both commodity hardware and public cloud computing services. Continue reading “MapR”
Kinetica (software)
Kinetica DB, Inc. is a company that develops a distributed, in-memory database management system that uses graphics processing units (GPUs). The software it markets is also called Kinetica. The company has headquarters in Arlington, Virginia and San Francisco . Continue reading “Kinetica (software)”
Imply Corporation
Imply is a computer software company founded by the creators of Druid , which aims to help organizations with exploratory data analysis using Druid. [1] Continue reading “Imply Corporation”
HPCC Systems
HPCC Systems (High Performance Computing Cluster) is big data software from LexisNexis Risk Solutions . In June 2011, the software was released under an open source dual-license model. [1] [2] [3] [4] Continue reading “HPCC Systems”
Hortonworks
Hortonworks is a big data software company based in Santa Clara, California . The company develops and supports Apache Hadoop for distributed data processing across computer clusters . Continue reading “Hortonworks”
Hack / reduce
hack / reduce is a 501(c)(3) non-profit created to grow a community of big data experts in the Boston area. [1] It is located in the historic Kendall Boiler and Tank Company building in Kendall Square in Cambridge, Massachusetts . Continue reading “Hack / reduce”
User: Guppywon / Alluxio
Alluxio is a venture-backed enterprise software company developed around the open source project of the same name. Alluxio’s technology was developed in a doctoral thesis at the University of California, Berkeley’s AMPLab, with grant funding from DARPA . Continue reading “User: Guppywon / Alluxio”
Groundhog Technologies
Groundhog Technologies is a privately held company founded in 2001 and headquartered in Cambridge, Massachusetts, USA. A spin-off of the MIT Media Lab , [1] [2] it was a semi-finalist in MIT’s $50K Entrepreneurship Competition in 2000 and was incorporated the following year. [3] [4] The company received its first round of financing in November 2002 from major Japanese corporations and their venture capital arms: Marubeni , Yasuda Enterprise Development and Japan Asia Investment Co. [5] [6] It received a second round of financing in 2004 and has since become self-sustaining. [7] Continue reading “Groundhog Technologies”
GridGain Systems
GridGain Systems is a privately held information technology company, established in 2007, with its headquarters located in Foster City, California . It provides software and services for large data systems by utilizing in-memory computing to increase data throughput and minimize latency . Continue reading “GridGain Systems”
Greenplum
Greenplum was a big data analytics company headquartered in San Mateo , California . Greenplum was acquired by EMC Corporation in July 2010. [1] Starting in 2012 its database management system software became known as the Pivotal Greenplum Database, sold through Pivotal Software and currently developed by Pivotal and the Greenplum Database open source community. Continue reading “Greenplum”
Flytxt
Flytxt BV is a customer data analytics software product company. [4] The company has its headquarters in Amsterdam , Netherlands, offices in Dubai and India, and a regional presence in Paris , London , Singapore , Nairobi , and Mexico City . Continue reading “Flytxt”
Fluentd
Fluentd is a cross-platform open source data collection software project originally developed at Treasure Data. It is written primarily in the Ruby programming language. Continue reading “Fluentd”
Dataiku
Dataiku is a computer software company headquartered in New York City . The company develops collaborative data science software marketed for big data . Continue reading “Dataiku”
Databricks
Databricks is a company founded by the creators of Apache Spark , [1] which aims to help customers with cloud-based big data processing using Spark. [2] [3] Databricks grew out of the AMPLab project at UC Berkeley, which was involved in making Apache Spark , a distributed computing framework built atop Scala . Databricks develops a web-based platform for working with Spark that provides automated cluster management and IPython -style notebooks . In addition to building the Databricks platform, the company co-organizes massive open online courses about Spark [4] and the Spark community conference, the Spark Summit. Continue reading “Databricks”
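Since the platform is built around Spark, a minimal PySpark sketch may help make the entry concrete; it assumes only a local Spark installation, and the input file name is hypothetical.

```python
# Minimal PySpark sketch: word count over a text file.
# Assumes Spark is installed; "logs.txt" is a hypothetical input file.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount-demo").getOrCreate()

# Read lines as strings, split into words, and count occurrences.
lines = spark.read.text("logs.txt").rdd.map(lambda row: row[0])
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

for word, count in counts.take(10):
    print(word, count)

spark.stop()
```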
cVidya
cVidya Networks is a provider of big data analytics for communications and digital service providers . cVidya’s offerings cover business protection and business growth, including revenue assurance , fraud management, marketing analytics and data monetization . The company has 300 employees in 18 countries and over 150 customers. cVidya’s investors include Battery Ventures , Carmel Ventures , Hyperion, StageOne, Saints Capital and Plenus. Continue reading “cVidya”
CtrlShift
CtrlShift is a Singapore- headquartered programmatic marketing company. It was founded in January 2015 from the merger of three advertising technology companies: the scientific media-buying platform AdzCentral; digital consultancy Better; and ad-tech distribution company Asia Digital Ventures. [1] Continue reading “CtrlShift”
Cloudera
Cloudera Inc. is a United States -based software company that provides Apache Hadoop -based software, support and services, and training to business customers. Continue reading “Cloudera”
CBIG Consulting
CBIG Consulting is a consulting group that specializes in business intelligence , big data analytics, data warehouse and cloud computing analytics. Continue reading “CBIG Consulting”
Cambridge Technology Enterprises
Cambridge Technology Enterprises is a global IT services company. The company is predominantly US-focused and serves companies such as Schneider Electric, Hills Pet and Iron Mountain. Cambridge Technology Enterprises helps organizations by leveraging AI, big data , cloud and machine learning. The company was also recently assessed at CMMI v1.3 Level 5, with ISO 9001:2008 and ISO 27001:2005 certifications. The company has a workforce of 350, with offices in Atlanta , Kansas , Louisville , San Francisco , Boston and Pittsburgh and development centers in Hyderabad , Chennai and Bangalore in India. Continue reading “Cambridge Technology Enterprises”
Bright Computing
Bright Computing , Inc. is a developer of software for deploying and managing high-performance computing (HPC) clusters, big data clusters, and OpenStack in data centers and in cloud computing environments. [1] Continue reading “Bright Computing”
BigPanda
BigPanda is a technology company headquartered in Palo Alto, California . [1] The company’s flagship product is an IT systems management platform that aggregates and correlates IT alerts to create high-level IT incidents. [2] [3] Continue reading “BigPanda”
Big Data Scoring
Big Data Scoring is a cloud-based service that lets consumer lenders improve loan quality and acceptance rates through the use of big data . The company was founded in 2013 and has offices in the UK , Finland , Chile , Indonesia and Poland . The company’s services are aimed at all lenders: banks , payday lenders , peer-to-peer lending platforms , microfinance providers and leasing companies . [1] Continue reading “Big Data Scoring”
Big Data Partnership
Big Data Partnership was a big data professional services company based in London , United Kingdom. It provided consulting, certified training and support services to enterprises based in Europe, the Middle East and Africa . Continue reading “Big Data Partnership”
Axtria
Axtria is a New Jersey-based technology company that develops and markets cloud-based data analytics services and solutions for business. [4] The company’s software is embedded into commercial processes to analyze data and provide insights. [5] The company is headquartered in Berkeley Heights, New Jersey , and has additional locations in California , Arizona, Georgia , Virginia , and Ireland and development centers in Boston , Chicago and Gurgaon, India . [6] [7] [3] Continue reading “Axtria”
Alpine Data Labs
Alpine Data Labs provides an advanced analytics interface working with Apache Hadoop and big data . [1] [2] [3] [4] [5] [6] It provides a collaborative, visual environment to create and deploy analytical workflows and predictive models. [7] [8] This aims to make analytics more accessible to business-analyst-level staff, sales and other departments that use the data, rather than requiring a “data engineer” or “data scientist” who understands frameworks and languages like MapReduce or Pig . [2] [9] [10] Continue reading “Alpine Data Labs”
Lucidworks
Lucidworks is a San Francisco, California -based enterprise search technology company offering an application development platform, commercial support, consulting, training and value-add software for the open source projects Apache Lucene and Apache Solr . Lucidworks is a private company founded in 2007 as Lucid Imagination and publicly launched on January 26, 2009. The company was renamed Lucidworks on August 8, 2012. [1] The company received Series A funding from Granite Ventures and Walden International in September 2008; In-Q-Tel is a strategic investor. In August 2014, Lucidworks closed an $8 million Series C round with Shasta Ventures, Granite Ventures and Walden International participating. [2] In November 2015, Lucidworks closed a $21 million Series D round with Allegis Capital and existing investors Shasta Ventures and Granite Ventures participating. [3] Continue reading “Lucidworks”
List of big data companies
This is an alphabetical list of notable companies using the marketing term big data : Continue reading “List of big data companies”
Web intelligence
Web intelligence is the area of scientific research and development that explores the roles of, and makes use of, artificial intelligence and information technology in new products, services and frameworks that are empowered by the World Wide Web . [1] Continue reading “Web intelligence”
Venice Time Machine
The Venice Time Machine is a large international project launched by the Swiss Federal Institute of Technology in Lausanne (EPFL) and the Ca’ Foscari University of Venice in 2012 that aims to build a multidimensional collaborative model of Venice by creating an open digital archive of the city’s cultural heritage, covering more than 1,000 years of evolution. [1] The project aims to trace the circulation of news, money, commercial goods, migration, and artistic and architectural patterns, among others, to create a “Big Data of the Past”. Its fulfillment would represent the largest database ever created on Venetian documents. [2] The project is an example of the new area of scholarly activity that has emerged in the digital age: digital humanities . Continue reading “Venice Time Machine”
Synqera
Synqera is a technology software company providing a service for personalizing the retail experience. The company is headquartered in Saint Petersburg , Russia . Continue reading “Synqera”
Social physics
Social physics or sociophysics is a field of science that uses mathematical tools inspired by physics to understand the behavior of human crowds. In modern commercial use, it can also refer to the analysis of social phenomena with big data . Continue reading “Social physics”
Social media mining
Social media mining is the process of representing, analyzing, and extracting actionable patterns and trends from raw social media data . The term “mining” is an analogy to the resource extraction process of mining for rare minerals. Resource extraction mining requires mining companies to sift through vast quantities of raw minerals; likewise, social media “mining” requires human data analysts and automated software programs to sift through massive amounts of raw social media data (e.g., social media usage, online behavior, sharing of content, connections between individuals, online buying behavior, etc.) in order to discern patterns and trends. These patterns and trends inform research and, for companies, new products, processes and services. Continue reading “Social media mining”
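As a minimal illustration of the automated sifting described above, the following Python sketch extracts one simple trend signal (top hashtags) from raw post text; the sample posts are invented.

```python
# Sketch of one social media mining step: surfacing a trend
# (most frequent hashtags) from raw post text.
import re
from collections import Counter

posts = [
    "Loving the new phone #tech #gadgets",
    "Big sale this weekend #deals #tech",
    "Conference keynote was great #tech",
]

hashtags = Counter()
for post in posts:
    # Find every "#word" token and count it, case-insensitively.
    hashtags.update(tag.lower() for tag in re.findall(r"#\w+", post))

print(hashtags.most_common(3))  # [('#tech', 3), ('#gadgets', 1), ('#deals', 1)]
```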
Social Credit System
The Social Credit System is a proposed Chinese government initiative [1] [2] [3] for developing a national reputation system . It has been reported to assign a “social credit” rating to each citizen based on government data about their economic and social status. [4] [3] [5] [6] [7] It works as a mass surveillance tool and uses big data analysis technology. [8] In addition, it is also meant to operate on the Chinese market. [9] Continue reading “Social Credit System”
Session (web analytics)
In web analytics , a session , or visit, is a unit of measurement of a user’s actions taken within a period of time or with regard to the completion of a task. Sessions are also used in operational analytics and the provision of user-specific recommendations . There are two primary methods used to define a session: time-oriented approaches, based on continuity in user activity, and navigation-based approaches, based on continuity in a chain of requested pages. Continue reading “Session (web analytics)”
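The time-oriented approach can be illustrated with a short Python sketch; the 30-minute inactivity threshold below is a common convention, not a standard.

```python
# Time-oriented session definition: a new session starts whenever the
# gap between consecutive events exceeds an inactivity timeout.
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # common convention

def sessionize(timestamps):
    """Group a sorted list of event timestamps into sessions."""
    sessions, current = [], []
    for ts in timestamps:
        if current and ts - current[-1] > TIMEOUT:
            sessions.append(current)  # gap too large: close the session
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

events = [datetime(2018, 1, 1, 9, 0), datetime(2018, 1, 1, 9, 10),
          datetime(2018, 1, 1, 11, 0)]
print(len(sessionize(events)))  # 2 sessions
```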
Security visualization
Security visualization is a subject that broadly covers the aspects of big data , visualization , human perception and security . Each day, more and more data is collected in the form of data files. Big data mining techniques like MapReduce help narrow the search for meaning in that data. Data visualization is a data analytics technique used to engage the human brain in finding patterns in data. Continue reading “Security visualization”
Savi Technology
Savi Technology was founded in 1989 and is based in Alexandria, Virginia . Continue reading “Savi Technology”
Data literacy
Data literacy is the ability to read, create and communicate data, and has been formally described in varying ways. Discussion of the skills inherent to data literacy, and of feasible instructional methods, has emerged as data collection becomes routinized and talk of data analysis and big data has become commonplace in the news, business, [1] government [2] and society in countries across the world. [3] Continue reading “Data literacy”
Lambda architecture
Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream-processing methods. This approach attempts to balance latency , throughput , and fault-tolerance by using batch processing to build comprehensive views of historical data alongside real-time stream processing for recent data. The two view outputs may be joined before presentation. The rise of lambda architecture is correlated with the growth of big data , real-time analytics, and the drive to mitigate the latencies of map-reduce. [1] Continue reading “Lambda architecture”
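A conceptual Python sketch of the batch/speed split may help; all names and data here are invented for illustration.

```python
# Conceptual lambda architecture: a batch layer recomputes complete
# views from all historical events, a speed layer keeps an incremental
# view of recent events, and queries merge the two.
from collections import Counter

historical_events = ["page_a", "page_b", "page_a"]  # already batch-processed
recent_events = ["page_a", "page_c"]                # arrived since last batch run

def batch_view(events):
    """Batch layer: full recomputation over the master dataset."""
    return Counter(events)

def speed_view(events):
    """Speed layer: incremental counts for not-yet-batched events."""
    return Counter(events)

def query(page):
    """Serving layer: join batch and real-time views at query time."""
    return batch_view(historical_events)[page] + speed_view(recent_events)[page]

print(query("page_a"))  # 3: two from the batch view, one from the speed view
```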
IT operations analytics
In the fields of information technology (IT) and systems management , IT operations analytics (ITOA) is an approach or method to retrieve, analyze, and report data for IT operations. ITOA may apply big data analytics to large datasets to produce business insights. [1] [2] In 2014, Gartner predicted its use would increase revenue or reduce costs. [3] By 2017, it was predicted that 15% of enterprises would use IT operations analytics technologies. [2] Continue reading “IT operations analytics”
Intelligence engine
An intelligence engine is a type of enterprise information management platform that combines business rule management , predictive analytics , and prescriptive analytics to form a unified information-access platform providing real-time intelligence through search technologies , dashboards and/or existing business infrastructure. Intelligence engines are specific to a process and/or business problem, resulting in industry- and/or function-specific marketing. They can be differentiated from enterprise resource planning (ERP) systems by their decision management functionality. Continue reading “Intelligence engine”
Industry 4.0
Industry 4.0 is a name for the current trend of automation and data exchange in manufacturing technologies. It includes cyber-physical systems , the Internet of things , cloud computing [1] [2] [3] [4] and cognitive computing . Continue reading “Industry 4.0”
Industrial big data
Industrial big data refers to the large amount of diversified time series data generated at high speed by industrial equipment, [1] known as the Internet of things. [2] The term emerged in 2012 along with the concept of “Industry 4.0”, and differs from “big data” as popularized in information technology marketing in that the data created by industrial equipment might hold more potential business value. [3] Industrial big data takes advantage of industrial Internet technology. It uses raw data to support management decision making, so as to reduce maintenance costs and improve customer service. [2] Continue reading “Industrial big data”
Head / tail Breaks
Head/tail breaks is a clustering algorithm for data with a heavy-tailed distribution, such as power laws and lognormal distributions . A heavy-tailed distribution reflects a scaling pattern of far more small things than large ones. [1] The classification divides values around the arithmetic mean into a large part (called the head) and a small part (called the tail), then recursively repeats the division on the head until the notion of far more small things than large ones no longer holds. Head/tail breaks is useful not only for classification, but also for visualization of big data: keeping just the head works because the head is self-similar to the whole. Head/tail breaks can be applied not only to vector data such as points, lines and polygons, but also to raster data like the digital elevation model (DEM). Continue reading “Head / tail Breaks”
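A minimal Python implementation sketch of the algorithm as described above; the 40% head threshold follows common practice, and variants differ in the exact stopping rule.

```python
def head_tail_breaks(values, head_ratio=0.4):
    """Return class break values for a heavy-tailed list of numbers."""
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        # Stop when the head is no longer a clear minority, i.e. "far
        # more small things than large ones" no longer holds.
        if not head or len(head) / len(data) >= head_ratio:
            break
        breaks.append(mean)
        data = head  # recurse on the head only
    return breaks

# Heavy-tailed sample: far more small values than large ones.
sample = [1] * 20 + [2, 2, 3, 3, 4, 5, 6, 8, 10, 15, 20, 30, 60, 250]
print(head_tail_breaks(sample))  # two break values, near 12.9 and 75.0
```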
GIS United
GIS United (GU / GIS Utd) is a union of GIS specialists with a variety of backgrounds, such as business administration, public administration, environmental engineering, mechanical engineering, statistics, urban engineering, architecture, historical studies, literature and art. It is a consulting firm specializing in the analysis of geospatial big data, headquartered in Seogyo, Mapo, Seoul, South Korea . Continue reading “GIS United”
Flutura Decision Sciences and Analytics
Flutura Decision Sciences and Analytics is an industrial Internet of things (IoT) company that focuses on machine-to-machine communication and big data analytics, serving customers in the manufacturing, energy and engineering industries. Its main offices are in Palo Alto, California, and its development center is in Bengaluru, India. Continue reading “Flutura Decision Sciences and Analytics”
ECL (data-centric programming language)
ECL is a declarative, data-centric programming language designed in 2000 to allow a team of programmers to process big data across a high-performance computing cluster without the programmer being involved in many of the lower-level, imperative decisions. [1] [2] Continue reading “ECL (data-centric programming language)”
dataveillance
Dataveillance is the practice of monitoring and collecting metadata. [1] The word is a portmanteau of data and surveillance. [2] Dataveillance is concerned with the continuous monitoring of users’ communications and actions across various platforms. [3] For instance, dataveillance refers to the monitoring of data resulting from credit card transactions, GPS coordinates, emails, social networks , etc. Using digital media often leaves traces of data and creates a digital footprint of our activity. [4] This type of surveillance is often unknown to the individual and occurs covertly. [5] Unlike sousveillance , where individuals willingly monitor their own activity, dataveillance is more discreet and unseen. Dataveillance may involve the monitoring of groups of individuals. There exist three types of dataveillance: personal dataveillance, mass dataveillance, and facilitative mechanisms. [3] Continue reading “dataveillance”
DataOps
DataOps is an automated, process-oriented methodology used by big data teams to improve the quality and reduce the cycle time of data analytics . While DataOps began as a set of best practices, it has matured into a new and independent approach to data analytics. [1] DataOps applies to the entire data lifecycle, [2] from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations. [3] In process and methodology, DataOps applies Agile software development , DevOps [3] and the statistical process control used in lean manufacturing to data analytics. [4] Continue reading “DataOps”
Datafication
Datafication is the transformation of many aspects of our life into computerized data [1] and the conversion of this information into new forms of value. [2] Kenneth Neil Cukier and Viktor Mayer-Schönberger introduced the term datafication in 2013. [3]
Data-centric security
Data-centric security is an approach to security that emphasizes the security of the data itself rather than the security of networks, servers, or applications. Data-centric security is evolving rapidly as companies increasingly rely on digital information to run their business and big data projects become mainstream. [1] [2] [3] Data-centric security also enables organizations to relate security services directly to the data they protect; a relationship that is often obscured by the presentation of security as an end in itself. [4] Continue reading “Data-centric security”
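One way to make the idea concrete is field-level encryption, where the sensitive value itself is protected rather than the channel or server it sits on; this sketch uses the third-party Python cryptography package and is an illustration, not any particular vendor’s method.

```python
# Data-centric idea in miniature: protect the sensitive field itself,
# so the record stays protected wherever it travels.
# Requires: pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice, managed by a key service
f = Fernet(key)

# Only the sensitive field is encrypted; the record can now cross
# networks, servers, and applications without exposing it.
record = {"name": "Alice", "ssn": f.encrypt(b"123-45-6789")}

# Only a holder of the key can recover the sensitive value.
print(f.decrypt(record["ssn"]).decode())
```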
Data Shadows
Data shadows are the information that an individual unintentionally leaves behind, which is then used by organizations and servers. [1] [2] This information forms a vastly detailed record of an individual’s everyday life, including the individual’s thoughts and interests, their communications and work information, information about the organizations they interact with, and so forth. [1] The concept of a data shadow is closely linked with data footprints and dataveillance . Data footprints and shadows produce information that is dispersed across dozens of organizations and servers. [3] Continue reading “Data Shadows”
Data lineage
Data lineage includes the data’s origins, what happens to it, and where it moves over time. [1] Data lineage provides visibility into the process of data analysis . [2] Continue reading “Data lineage”
Draft: Data ethics
Big data ethics refers to the ethical dilemmas and concerns presented by big data technologies and industries. Big data is characterized by being continually produced, often without the producers’ direct intent. [1] The ethical concerns raised by big data range from privacy and data ownership to open data and democracy. Continue reading “Draft: Data ethics”
Continuous analytics
Continuous analytics is a data science process that abandons ETLs and complex batch data pipelines in favor of cloud-native and microservices paradigms. Continuous data processing enables real-time interactions and immediate insights with fewer resources. Continue reading “Continuous analytics”
Cambridge Analytica
Cambridge Analytica ( CA ) is a privately held company that combines data mining and data analysis with strategic communication for the electoral process. It was created in 2013 as an offshoot of its British parent company SCL Group to participate in American politics . [2] In 2014, CA was involved in 44 US political races. [3] The company is mostly owned by the family of Robert Mercer , an American hedge-fund manager who supports many politically conservative causes. [2] [4] The firm maintains offices in New York City, Washington, DC , and London. [5] Continue reading “Cambridge Analytica”
Burst buffer
In the high-performance computing environment, a burst buffer is a fast, intermediate storage layer between the front-end computing processes and the back-end storage systems . It emerged as a storage solution to bridge the ever-increasing performance gap between the processing speed of compute nodes and the input/output (I/O) bandwidth of the storage systems. [1] Burst buffers are built from high-performance storage devices, such as NVRAM and SSD , and typically offer far higher I/O bandwidth than the back-end storage systems they shield. Continue reading “Burst buffer”
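The following Python sketch simulates the concept: writes land in a fast intermediate buffer and drain asynchronously to slower back-end storage. The timings and block names are invented for illustration.

```python
# Conceptual burst buffer: compute processes write quickly to fast
# intermediate storage, which drains to slow back-end storage
# asynchronously.
import queue
import threading
import time

burst_buffer = queue.Queue()

def drain_to_backend():
    while True:
        block = burst_buffer.get()
        time.sleep(0.1)  # simulated slow back-end storage write
        print("persisted", block)
        burst_buffer.task_done()

threading.Thread(target=drain_to_backend, daemon=True).start()

# The buffer "absorbs the burst": these writes return immediately.
for i in range(5):
    burst_buffer.put(f"checkpoint-block-{i}")

burst_buffer.join()  # wait until the buffer has fully drained
```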
BisQue (Bioimage Analysis and Management Platform)
BisQue [1] is a free, open source web-based platform for the exchange and exploration of large, complex datasets. It is being developed at the Vision Research Lab [2] at the University of California, Santa Barbara . BisQue specifically supports large-scale, multi-dimensional, multimodal images and image analysis. Metadata is stored as arbitrarily nested and linked tag/value pairs, allowing for domain-specific data organization. Image analysis modules can be added to perform complex analysis tasks on compute clusters. Analysis results are stored in the database for further querying and processing, and data and analysis provenance is maintained for reproducibility of results. BisQue can be easily deployed in cloud computing environments or on computer clusters for scalability, and has been integrated into the NSF Cyberinfrastructure project CyVerse. [3] The user interacts with BisQue via any modern web browser . Continue reading “BisQue (Bioimage Analysis and Management Platform)”
Big Data to Knowledge
Big Data to Knowledge (BD2K) is a project of the National Institutes of Health for knowledge extraction from big data . Continue reading “Big Data to Knowledge”
Big Data Maturity Model
Big Data Maturity Models (BDMM) are artifacts used to measure big data maturity. [1] These models help organizations create structure around their big data capabilities and identify where to start. [2] They provide tools that assist organizations in defining goals for their data and their organization. BDMMs also provide a methodology for measuring the state of a company’s big data capability, the effort required to complete its current stage or phase of maturity, and the path to the next stage. Additionally, BDMMs measure and manage the speed of both the progress and the adoption of big data programs in the organization. [1] Continue reading “Big Data Maturity Model”
Big data
Big data is data sets that are so voluminous and complex that traditional data-processing application software is inadequate to deal with them. Big data challenges include capturing data , data storage , data analysis , search, sharing , transfer , visualization , querying , updating and information privacy . Big data is often characterized by three dimensions: volume, variety and velocity. Continue reading “Big data”
Astroinformatics
Astroinformatics is an interdisciplinary field of study involving the combination of astronomy , data science , informatics , and information / communications technologies. [1] [2] Continue reading “Astroinformatics”
Prescriptive analytics
Prescriptive analytics is the third and final phase of business analytics , which also includes descriptive and predictive analytics. [1] [2] Continue reading “Prescriptive analytics”
Predictive analytics
Predictive analytics encompasses a range of statistical techniques from predictive modeling , machine learning , and data mining that analyze current and historical facts to make predictions about future or otherwise unknown events. [1] [2] Continue reading “Predictive analytics”
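A minimal predictive-analytics sketch in Python using scikit-learn: a model is fitted to historical facts and then scores a new case. The data and feature names are invented.

```python
# Fit a model to historical facts, then predict an unknown outcome.
# Requires: pip install scikit-learn
from sklearn.linear_model import LogisticRegression

# Historical facts: (account_age_months, late_payments) -> defaulted?
X_history = [[24, 0], [6, 3], [36, 1], [3, 4], [48, 0], [12, 2]]
y_history = [0, 1, 0, 1, 0, 1]

model = LogisticRegression().fit(X_history, y_history)

# Score a new, unseen customer.
print(model.predict([[18, 1]]))        # predicted class
print(model.predict_proba([[18, 1]]))  # class probabilities
```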
Embedded analytics
Embedded analytics is technology designed to make data analysis and business intelligence more accessible to all kinds of applications and users. Continue reading “Embedded analytics”
Business analytics
Business analytics ( BA ) refers to the skills, technologies and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. [1] Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods . In contrast, business intelligence traditionally focuses on a set of metrics to measure past performance and guide business planning, which is also based on data and statistical methods. Continue reading “Business analytics”
Analytics
Analytics is the discovery, interpretation, and communication of meaningful patterns in data . Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics , computer programming and operations research to quantify performance. Continue reading “Analytics”
Data Administration
Administrative data are a type of big data collected by governments or other organizations for non-statistical reasons: registration, transactions, and record keeping. [1] They capture part of the output of administering a program; examples include birth and death records, records of the crossing of people and goods over borders, pensions, and taxation. [2] These types of data are used in the supply of information: when turned into indicators, administrative data can show trends over time and reflect real-world conditions. The management of this information involves the Internet, software, telecommunications, databases and management systems, system development methods, and information systems. Managing the information resources of the public sector is a complex routine: it begins with the collection of data, passes through the hardware and software that store, manipulate, and transform the data, and finally addresses organizational policies and procedures. [3] Continue reading “Data Administration”
Timshel (company)
Timshel is a privately held data services startup company run by Michael Slaby . Slaby was formerly involved in the digital initiative of Barack Obama’s presidential campaign. Timshel has around 50 employees in Chicago and New York . [1] Continue reading “Timshel (company)”
The Resilience Project
The Resilience Project is a project, carried out by the Icahn Institute for Genomics at Mount Sinai in collaboration with Sage Bionetworks . [1] Continue reading “The Resilience Project”
The Groundwork
The Groundwork is a privately held technology firm, run by Michael Slaby , which was formed in June 2014. [1] Campaign finance disclosures revealed that Hillary Clinton’s campaign was a client of the Groundwork. [2] [1] Most of the Groundwork’s employees are back-end software developers drawn from companies such as Netflix , DreamHost , and Google . [1] Continue reading “The Groundwork”
Superman memory crystal
The Superman memory crystal [1] is a nanostructured glass for recording 5-D digital data [2] using a femtosecond laser writing process. [3] The memory crystal is capable of storing up to 360 terabytes of data [4] [5] for billions of years. [6] [7] [8] [9] The concept was experimentally demonstrated in 2013. [10] [11] [12] Continue reading “Superman memory crystal”
Michael Slaby
Michael Slaby currently runs the Chicago -based startup he founded, Timshel , [1] which developed the platform known as The Groundwork . [2] [3] Continue reading “Michael Slaby”
Data philanthropy
Data philanthropy describes a form of collaboration in which private sector companies share data for public benefit. [1] Many uses of data philanthropy are being explored, spanning humanitarian, corporate, human rights, and academic applications. Since introducing the term in 2011, the United Nations Global Pulse has advocated for a global “data philanthropy movement”. [2] Continue reading “Data philanthropy”
Civis Analytics
Civis Analytics is an Eric Schmidt -backed data science startup company founded by Dan Wagner in 2013. [1] Continue reading “Civis Analytics”
Big memory
Big memory refers to computer systems equipped with large amounts of random-access memory ( RAM ). Typical workloads are databases, in-memory caches, and graph analytics, [1] and, more generally, data science and big data. Continue reading “Big memory”
The value of open data
Open data is the free availability and usability of data, mostly public data. The demand for it rests on the assumption that beneficial developments such as open government are encouraged when appropriately maintained registers and user-friendly prepared information are made publicly available, allowing more transparency and cooperation. To this end, creators use license models that largely waive copyright , patents or other proprietary rights. Open data resembles numerous other “open” movements, such as open source , open content , open access and open education , and is a prerequisite for open government.
Definition of “Open Data”
Open data comprises all data holdings that, in the interest of the general public, are made available to society without restriction for free use, redistribution and reuse. [1] Examples include teaching materials, spatial data , statistics, market information, scientific publications , medical research results, and radio and television broadcasts. Open data is not limited to databases of public administration: privately operating companies, universities and broadcasters, as well as non-profit bodies, also produce relevant contributions. [1]
To mark data as open data, various licenses are used, such as CC Zero (CC0). Licenses that restrict the use of data, for example by prohibiting modification or commercial use, do not comply with the ” Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities “, and such data are not considered open data.
Demands of the open data movement
The concept of open data is not new, but unlike, for example, open access, the term has not yet been generally defined. Open data refers specifically to information outside of text form , such as weather information, maps, genomes or medical data. Since this material is of commercial interest, conflicts often arise. Proponents of open data argue that the data are common property and that their free use must not be hindered by restrictions.
A quotation typically used to illustrate the need for open data:
“Numerous scientists have pointed out the irony that right at the historical moment when we have the technologies to permit worldwide availability and distributed processing of scientific data, broadening collaboration and accelerating the pace and depth of discovery […] we are busy locking up that data and preventing the use of correspondingly advanced technologies on knowledge.”
Data producers often neglect to define user rights. A missing or overly restrictive (free) license, for example, unnecessarily excludes data from further free use.
The open data movement not only calls for free access to data but also generates data itself; one example is OpenStreetMap . Proponents argue that the open data concept enables a more democratic society: the British website TheyWorkForYou.com, for example, makes it possible to track the voting records of British MPs. [3] In the context of data relating to a government, one also speaks of open government . Rob McKinnon said in a presentation at re:publica that the loss of the data privilege “can lead to new power structures within a society”. [4] Another good example is the site farmsubsidy.org, which shows to whom EU agricultural subsidies are paid; these account for almost half of the total EU budget. German politicians in particular have long balked at making this information public.
To meet the criteria of open data, data must be provided in a structured, machine-readable form so that it can be filtered, searched and further processed by other applications. Data from government agencies, for example, are often published only as PDF files and therefore cannot be further processed without difficulty.
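The difference machine-readability makes can be shown in a few lines of Python: a CSV published as open data can be filtered and reprocessed directly, unlike a table locked inside a PDF. The file name and column names here are hypothetical.

```python
# A structured, machine-readable open dataset can be filtered and
# reprocessed in a few lines, with no manual re-keying.
import csv

with open("city_budget.csv", newline="", encoding="utf-8") as fh:
    rows = list(csv.DictReader(fh))

# Filter to one department and total its spending.
education = [r for r in rows if r["department"] == "education"]
total = sum(float(r["amount_eur"]) for r in education)
print(f"education spending: {total:.2f} EUR")
```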
Big data
Big data, literally “large data” (in French, mégadonnées is the recommended term, [3] sometimes données massives [4]), designates sets of data that have become so large that they exceed human intuition and analytical capacity, and even the capacity of conventional database or information management tools. [5]
The quantitative explosion (and frequent redundancy) of digital data forces new ways of seeing and analyzing the world. [6] New orders of magnitude affect the capture, storage, search, sharing, analysis and visualization of data . The prospects for big data processing are enormous and in part still unsuspected; there is often talk of new possibilities for exploring information disseminated by the media, [7] for knowledge and evaluation, for trend analysis and forecasting (climatic, environmental or socio-political, etc.), and for risk management (commercial, insurance, industrial, natural), as well as for religious, cultural and political phenomena, [8] but also for genomics or metagenomics, [9] medicine (understanding brain function , epidemiology , eco-epidemiology…), meteorology and adaptation to climate change , the management of complex energy networks (via smart grids or a future “energy internet”), ecology (functioning and dysfunction of ecological networks and food webs, with GBIF for example), and security and the fight against crime. [10] The multiplicity of these applications already reveals a genuine economic ecosystem involving the biggest players in the information technology sector. [11]
Some assume that big data could help companies reduce risk and facilitate decision-making, or create differentiation through predictive analytics and a more personalized and contextualized “customer experience”. [12]
Various experts, major institutions (such as MIT [13] in the United States), administrations [14] and specialists in the field of technologies or their uses [15] consider the big data phenomenon one of the major IT challenges of the 2010–2020 decade and have made it one of their new research and development priorities; it could notably lead to artificial intelligence explored by self-learning artificial neural networks. [16]
Dimensions
Big data is accompanied by the development of analytical applications that process the data to make sense of it. [34] These analyses are called big analytics [35] or “data crushing”. They bear on complex quantitative data and use distributed computing methods and statistics.
In 2001, a research report by the META Group (now Gartner ) [36] defined the issues inherent in the growth of data as three-dimensional: complex analyses meet the so-called “3V” rule (volume, velocity and variety [37]). This model is still widely used today to describe the phenomenon. [38]
The global average annual growth rate of the big data technology and services market over the 2011–2016 period was expected to be 31.7%, with the market reaching $23.8 billion in 2016 (according to IDC, March 2013). Big data was also expected to represent 8% of European GDP in 2020 (AFDEL, February 2013).
Volume
Volume is a relative dimension: big data, as Lev Manovich noted in 2011, [39] was once defined as “data sets large enough to require super-computers”, but it quickly (in the 1990s–2000s) became possible to use standard software on desktop computers to analyze or co-analyze large data sets. [40]
The volume of stored data is growing rapidly: digital data created worldwide grew from 1.2 zettabytes per year in 2010 to 1.8 zettabytes in 2011, [41] then 2.8 zettabytes in 2012, and is expected to rise to 40 zettabytes in 2020. As an example, in January 2013 Twitter generated 7 terabytes of data each day and Facebook 10 terabytes. [42] In 2014, Facebook Hive generated 4,000 TB of data per day. [43]
It is technical and scientific facilities (meteorology, etc.) that produce the most data. Many pharaonic projects are under way. The Square Kilometre Array radio telescope, for example, will produce 50 terabytes of analyzed data per day, drawn from raw data produced at a rate of 7,000 terabytes per second. [44]
Variety
The volume of big data confronts data centers with a real challenge: the variety of the data. These are not traditional relational data ; they are raw, semi-structured or even unstructured (though unstructured data must be structured before use [45]). They are complex data coming from the web ( web mining ), from text (text mining) and from images (image mining). They may be public (open data, web of data), geo-demographic by block ( IP addresses ), or the property of consumers (360° profiles). All of this makes them difficult to use with traditional tools.
The multiplication of data-collection tools on individuals and objects makes it possible to collect ever more data. [46] And the analyses become all the more complex as they increasingly bear on the links between data of different natures.
Velocity
Velocity is the frequency with which data are generated, captured, shared and updated. [47]
Growing data flows must be analyzed in near real time ( data stream mining ) to meet the needs of time-sensitive processes. [48] For example, the systems put in place by stock markets and companies must be able to process the data before a new generation cycle has begun, at the risk of humans losing much of their control over the system when the main operators become “robots” capable of issuing buy or sell orders at nanosecond scale ( high-frequency trading ) without having all the relevant analysis criteria for the medium and long term. Continue reading “Big data”
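A small Python sketch of the near-real-time analysis the velocity dimension calls for: a sliding-window statistic updated as each event arrives. The stream here is simulated; in production it would be a live message feed.

```python
# Near-real-time stream analysis: a sliding-window average over an
# unbounded stream of price ticks, updated as each tick arrives.
from collections import deque

window = deque(maxlen=100)  # keep only the most recent 100 ticks

def on_tick(price):
    window.append(price)
    return sum(window) / len(window)  # rolling average per tick

for price in [100.0, 100.5, 99.8, 101.2, 100.9]:
    print(f"tick {price}: rolling mean {on_tick(price):.2f}")
```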