A New Kind of Science

A New Kind of Science is a best-selling [1] and controversial book by Stephen Wolfram, published by his own company in 2002. It contains an empirical and systematic study of computational systems such as cellular automata. Wolfram calls these systems simple programs and argues that the scientific philosophy and methods appropriate to their study are relevant to other fields of science. Continue reading “A New Kind of Science”
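To make “simple programs” concrete, the sketch below (not Wolfram’s code; a minimal illustration) implements an elementary cellular automaton such as the book’s much-discussed Rule 30, where each cell’s next state depends only on itself and its two neighbors:

```python
# Elementary cellular automaton: the rule number's bits form the lookup
# table indexed by the 3-cell neighborhood (left, self, right).
def step(cells, rule=30):
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Start from a single live cell and print a few generations.
row = [0] * 15
row[7] = 1
for _ in range(5):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

Despite the three-line update rule, Rule 30 produces an aperiodic, seemingly random pattern, which is the book’s central observation about simple programs.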


Hydroinformatics

Hydroinformatics is a branch of informatics which concentrates on the application of information and communications technologies (ICTs) in addressing the increasingly serious problems of the equitable and efficient use of water for many different purposes. Growing out of the earlier discipline of computational hydraulics, the numerical simulation of water flows and related processes remains a mainstay of hydroinformatics, which encourages a focus not only on the technology but on its application in a social context. Continue reading “Hydroinformatics”

Humanistic informatics

Humanistic informatics (also known as humanities informatics) is one of several names chosen for the study of the relationship between human culture and technology. The term is fairly common in Europe but is little known in the English-speaking world, though digital humanities (also known as humanities computing) is in many cases roughly equivalent. Continue reading “Humanistic informatics”

Museum informatics

Museum informatics [1] is an interdisciplinary field of study that refers to the theory and application of informatics by museums. It is in essence a sub-field of cultural informatics [2] at the intersection of culture, digital technology, and information science. In the digital age, its place in the world of museums and archives has grown substantially, and it has connections with digital humanities. [3] Continue reading “Museum informatics”


Viroinformatics

Viroinformatics is an amalgamation of virology with bioinformatics, involving the application of information and communication technology in various aspects of viral research. Currently there are more than 100 different applications concerning diversity analysis, viral recombination, RNAi studies, drug design, protein-protein interaction, structural analysis and so on. [1] Continue reading “Viroinformatics”

Simulation governance

Simulation governance is a managerial function concerned with assurance of the reliability of information generated by numerical simulation. The term was introduced in 2011 [1] and specific technical requirements were addressed from the perspective of mechanical design in 2012 [2]. Its strategic importance was addressed in 2015 [3] [4]. At the 2017 NAFEMS World Congress in Stockholm it was identified as the first of eight “big issues” in numerical simulation. Continue reading “Simulation governance”

Geographic information system

A geographic information system (GIS) is a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data. The acronym GIS is sometimes used for geographical information science (GIScience), the academic discipline that studies geographic information systems [1] and is itself a broad domain within the broader academic discipline of geoinformatics. [2] What goes beyond a GIS is a spatial data infrastructure, a concept that has no such restrictive boundaries. Continue reading “Geographic information system”

Financial modeling

Financial modeling is the task of building an abstract representation (a model) of a real world financial situation. [1] This is a mathematical model designed to represent (a simplified version of) the performance of a financial asset or portfolio of a business, project, or any other investment. Financial modeling is a general term that means different things to different users; the reference usually relates either to accounting and corporate finance applications or to quantitative finance applications. While there is some debate in the industry as to the nature of financial modeling (whether it is a tradecraft, such as welding, or a science), the task of financial modeling has been gaining acceptance and rigor over the years. [2] Typically, financial modeling is understood to mean an exercise in either asset pricing or corporate finance, of a quantitative nature. In other words, financial modeling is about translating a set of hypotheses about the behavior of markets or agents into numerical predictions; for example, a firm’s decisions about investments (the firm will invest 20% of assets), or investment returns [3] (returns on “stock A” will, on average, be 10% higher than the market’s returns). Continue reading “Financial modeling”
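As a toy illustration of translating hypotheses into numerical predictions, the sketch below values a hypothetical project by discounting its assumed future cash flows; all figures (the 10% discount rate, the cash flows, the upfront cost) are invented for the example:

```python
# A minimal discounted-cash-flow model: the hypothesis is a set of
# future cash flows and a discount rate; the prediction is a value.
def npv(rate, cashflows):
    """Net present value of cashflows[t] received at the end of year t+1."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cashflows))

# Hypothetical inputs: 10% discount rate, three years of 100 cash flow.
present_value = npv(0.10, [100.0, 100.0, 100.0])
project_value = present_value - 240.0  # subtract the upfront cost
print(round(project_value, 2))  # positive => the project adds value
```

Real financial models layer many such assumptions (growth rates, margins, capital structure) on top of this basic discounting machinery.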

Environmental informatics

Environmental informatics is the science of information applied to environmental science. As such, it provides the information processing and communication infrastructure to the interdisciplinary field of environmental sciences, [1] aiming at data, information and knowledge integration, the application of computational intelligence to environmental data, and the identification of the environmental impacts of information technology. The UK Natural Environment Research Council defines environmental informatics as the “research and system development focusing on the environmental sciences relating to the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information.” [2] Kostas Karatzas defined environmental informatics as the “creation of a new knowledge-paradigm towards serving environmental management needs,” describing it as “an integrator of science, methods and techniques and not just the result of using information and software technology methods and tools for serving environmental engineering needs.” [3] Continue reading “Environmental informatics”

Computer simulation

Computer simulations reproduce the behavior of a system using a mathematical model. Computer simulations have become a useful tool for the mathematical modeling of many natural systems in physics (computational physics), astrophysics, climatology, chemistry and biology; human systems in economics, psychology, and social science; and engineering. Simulation of a system is represented as the running of the system’s model. It can be used to explore and gain new insights into new technology and to estimate the performance of systems too complex for analytical solutions. [1] Continue reading “Computer simulation”
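A minimal sketch of running a model: simulate exponential decay with simple Euler time steps and compare against the known analytic solution (the rate constant and step size here are arbitrary example values):

```python
import math

# Simulate dN/dt = -k*N with Euler steps; the model "runs" by
# repeatedly applying the update rule over small time increments.
def simulate_decay(n0, k, dt, steps):
    n = n0
    for _ in range(steps):
        n += -k * n * dt  # Euler update of the model
    return n

n_sim = simulate_decay(1000.0, 0.5, 0.001, 2000)   # simulate to t = 2
n_exact = 1000.0 * math.exp(-0.5 * 2.0)           # analytic N(t) = N0*e^(-kt)
print(n_sim, n_exact)  # the small gap is discretization error
```

For this system an analytic solution exists, so the simulation can be checked; the point of simulation is that the same stepping machinery still works when no closed-form solution does.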

Computational transportation science

Computational Transportation Science (CTS) is an emerging discipline that combines computer science and engineering with the modeling, planning, and economic aspects of transportation. The discipline studies how to improve the safety, mobility, and sustainability of the transportation system by taking advantage of information technologies and ubiquitous computing. A list of subjects encompassed by CTS can be found in [1]. Continue reading “Computational transportation science”

Computational sustainability

Computational sustainability is a broad field that attempts to optimize societal, economic, and environmental resources using methods from mathematics and computer science. [1] Sustainability in this context is the ability to produce enough energy for the world to support its biological systems, and the field uses the power of computers to process large quantities of information toward that end. [2] Continue reading “Computational sustainability”

Computational Statistics & Data Analysis

Computational Statistics & Data Analysis is a monthly peer-reviewed scientific journal covering research on and applications of computational statistics and data analysis. The journal was established in 1983 and is the official journal of the International Association for Statistical Computing, [1] a section of the International Statistical Institute. Continue reading “Computational Statistics & Data Analysis”

Computational statistics

Computational statistics, or statistical computing, is the interface between statistics and computer science. It is the area of computational science (or scientific computing) specific to the mathematical science of statistics. This area is also rapidly expanding, leading to calls for a broader concept of computing to be taught as part of general statistical education. [1] Continue reading “Computational statistics”
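A classic example of a computationally intensive statistical technique is the bootstrap; a minimal sketch (the data values are invented for the example):

```python
import random

# Bootstrap: estimate the sampling variability of a statistic (here,
# the mean) by repeatedly resampling the observed data with replacement.
def bootstrap_means(data, n_resamples=1000, seed=0):
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in data]
        means.append(sum(resample) / len(resample))
    return means

data = [2.1, 2.9, 3.2, 3.8, 4.0, 4.4, 5.1]
means = bootstrap_means(data)
center = sum(means) / len(means)
print(round(center, 2))  # sits near the sample mean
```

The spread of the resampled means approximates the standard error of the mean without any distributional formula, which is exactly the kind of computer-for-calculus trade that defines the field.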

Computational social science

Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. This means that computers are used to model, simulate, and analyze social phenomena. Fields include computational economics, computational sociology, cliodynamics, culturomics, and the automated analysis of content in social and traditional media. It focuses on investigating social and behavioral relationships and interactions through social simulation, modeling, network analysis, and media analysis. [1] Continue reading “Computational social science”

Computational semiotics

Computational semiotics is an interdisciplinary field that applies, conducts, and draws on research in logic, mathematics, the theory and practice of computation, formal and natural language studies, the cognitive sciences, and semiotics proper. A common theme of this work is the adoption of a sign-theoretic perspective on issues of artificial intelligence and knowledge representation. Many of its applications lie in the field of human-computer interaction (HCI) and the fundamental devices of recognition. Continue reading “Computational semiotics”

Computational scientist

A computational scientist is a person skilled in scientific computing. This person is usually a scientist, an engineer, or an applied mathematician who uses high-performance computers in different ways to advance the state of the art in their respective applied disciplines: physics, chemistry, the social sciences, and so forth. Thus scientific computing influences many fields, including economics, biology, law and medicine, to name a few. Continue reading “Computational scientist”

Computational engineering

Computational science and engineering (CSE) is a relatively new discipline that deals with the development and application of computational models and simulations, often coupled with high-performance computing, to solve complex physical problems arising in engineering analysis and design (computational engineering) as well as natural phenomena (computational science). CSE has been described as the “third mode of discovery” (next to theory and experimentation).[1] In many fields, computer simulation is integral and therefore essential to business and research. Computer simulation provides the capability to enter fields that are either inaccessible to traditional experimentation or where carrying out traditional empirical inquiries is prohibitively expensive. CSE should neither be confused with pure computer science, nor with computer engineering, although a wide domain in the former is used in CSE (e.g., certain algorithms, data structures, parallel programming, high performance computing) and some problems in the latter can be modeled and solved with CSE methods (as an application area). Continue reading “Computational engineering”

Computational science

Computational science (also scientific computing or scientific computation (SC)) is a rapidly growing multidisciplinary field that uses advanced computing capabilities to understand and solve complex problems. It is an area of science which spans many disciplines, but at its core it involves the development of models and simulations to understand natural systems. Continue reading “Computational science”

Computational phylogenetics

Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic analyses. The goal is to assemble a phylogenetic tree representing the evolutionary ancestry of a set of genes, species, or other taxa. For example, these techniques have been used to explore the family tree of hominid species [1] and the relationships between specific types of organisms. [2] Traditional phylogenetics relies on morphological data obtained by measuring and quantifying the phenotypic properties of representative organisms, while the more recent field of molecular phylogenetics uses nucleotide sequences encoding genes or amino acid sequences encoding proteins as the basis for classification. Many forms of molecular phylogenetics are closely related to and make extensive use of sequence alignment in constructing and refining phylogenetic trees, which are used to classify the evolutionary relationships between homologous genes represented in the genomes of divergent species. The phylogenetic trees constructed by computational methods are unlikely to perfectly reproduce the evolutionary tree that represents the historical relationships between the species being analyzed. The historical species tree may also differ from the historical tree of an individual homologous gene. Continue reading “Computational phylogenetics”

Computational photography

Computational photography or computational imaging refers to digital image capture and processing techniques that use digital computation instead of optical processes. Computational photography can improve the capabilities of a camera, or introduce features that are not possible at all with film-based photography. Examples of computational photography include in-camera computation of digital panoramas, [6] high-dynamic-range images, and light field cameras. Light field cameras use novel optical elements to capture three-dimensional scene information, which can then be used to produce 3D images, enhanced depth-of-field, and selective de-focusing (or “post focus”). Enhanced depth-of-field reduces the need for mechanical focusing systems. All of these features use computational imaging techniques. Continue reading “Computational photography”

Computational particle physics

Computational particle physics refers to the methods and computing tools developed by particle physics research. Like computational chemistry or computational biology, it is, for particle physics, both a specific branch and an interdisciplinary field relying on computer science, theoretical and experimental particle physics, and mathematics. The main fields of computational particle physics are: lattice field theory (numerical computations), automatic calculation of particle interaction or decay (computer algebra), and event generators (stochastic methods). Continue reading “Computational particle physics”

Computational neuroscience

Computational neuroscience (also theoretical neuroscience) studies brain function in terms of the information processing properties of the structures that make up the nervous system. [1] [2] It is an interdisciplinary computational science that links the various fields of neuroscience, cognitive science, and psychology with electrical engineering, computer science, mathematics, and physics. Continue reading “Computational neuroscience”
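One of the field’s classic models is the leaky integrate-and-fire neuron; a minimal sketch (the membrane parameters are illustrative, not fitted to data):

```python
# Leaky integrate-and-fire neuron driven by a constant input current:
# the membrane potential integrates input, leaks toward rest, and
# emits a spike (then resets) whenever it crosses threshold.
def lif_spikes(current, steps=1000, dt=0.1, tau=10.0, v_thresh=1.0):
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += dt * (-v + current) / tau  # leaky integration of the input
        if v >= v_thresh:               # threshold crossing = spike
            spikes += 1
            v = 0.0                     # reset after the spike
    return spikes

# Stronger input drives a higher firing rate; weak input never fires.
print(lif_spikes(0.5), lif_spikes(1.5), lif_spikes(3.0))
```

Even this caricature reproduces a basic neural input-output relationship (the frequency-current curve), which is why it remains a standard teaching and modeling tool.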

Computational neurogenetic modeling

Computational neurogenetic modeling (CNGM) is concerned with the study and development of dynamic neuronal models for modeling brain functions with respect to genes and dynamic interactions between genes. These include neural network models and their integration with gene network models. This area brings together knowledge from various scientific disciplines, such as computer and information science, neuroscience and cognitive science, genetics and molecular biology, as well as engineering. Continue reading “Computational neurogenetic modeling”

Computational musicology

Computational musicology is defined as the study of music by means of computational modeling and simulation. [1] It started in the 1950s and originally relied on statistical and mathematical methods rather than computers. Nowadays computational musicology depends largely on complex algorithms. Closely related fields include computer science, computer music, systematic musicology, music information retrieval, digital musicology, sound and music computing, and music informatics. [2] Continue reading “Computational musicology”

Computational mechanics

Computational mechanics is the discipline concerned with the use of computational methods to study phenomena governed by the principles of mechanics. Before the emergence of computational science (also called scientific computing) as a “third way” besides theoretical and experimental sciences, computational mechanics was widely considered to be a sub-discipline of applied mechanics. It is now considered to be a sub-discipline within computational science. Continue reading “Computational mechanics”

Computational magnetohydrodynamics

Computational magnetohydrodynamics (CMHD) is a rapidly developing branch of magnetohydrodynamics that uses numerical methods and algorithms to solve problems involving electrically conducting fluids. Most of the methods used in CMHD are borrowed from computational fluid dynamics. The complexity arises from the presence of a magnetic field and its coupling with the fluid. One of the important issues is to numerically maintain the ∇ · B = 0 (conservation of magnetic flux) condition, from Maxwell’s equations, to avoid any unphysical effects. Continue reading “Computational magnetohydrodynamics”

Computational lithography

Computational lithography (also known as computational scaling) is the set of mathematical and algorithmic approaches designed to improve the resolution achievable through photolithography. Computational lithography came to the forefront of photolithography in 2008 as the semiconductor industry grappled with the challenges associated with the transition to 22 nanometer CMOS process technology and beyond. Continue reading “Computational lithography”

Computational lexicology

Computational lexicology is a branch of computational linguistics concerned with the use of computers in the study of the lexicon. It has been more narrowly described by some scholars (Amsler, 1980) as the use of computers in the study of machine-readable dictionaries. It is distinguished from computational lexicography, which more properly would be the use of computers in the construction of dictionaries, though some researchers have used the terms synonymously. Continue reading “Computational lexicology”

Computational law

Computational law is a branch of legal informatics concerned with the mechanization of legal reasoning (whether done by humans or by computers). [1] It emphasizes explicit behavioral constraints and eschews implicit rules of conduct. Importantly, there is a commitment to a level of rigor in specifying laws that is sufficient to support entirely mechanical processing. Continue reading “Computational law”

Computational journalism

Computational journalism can be defined as the application of computation to the gathering, organization, sensemaking, communication and dissemination of news information, while upholding journalistic values such as accuracy and verifiability. [1] The field draws on technical aspects of computer science including artificial intelligence, content analysis (NLP, vision, hearing), visualization, personalization and recommender systems, as well as aspects of social computing and information science. Continue reading “Computational journalism”

Computational immunology

In academia, computational immunology is a field of science that encompasses high-throughput genomic and bioinformatics approaches to immunology. The field’s main aim is to convert immunological data into computational problems, solve these problems using mathematical and computational approaches, and then convert the results into immunologically meaningful interpretations. Continue reading “Computational immunology”

Computational group theory

In mathematics, computational group theory is the study of groups by means of computers. It is concerned with designing and analyzing algorithms and data structures to compute information about groups. The subject has attracted interest because for many interesting groups (including most of the sporadic groups) it is impractical to perform calculations by hand. Continue reading “Computational group theory”
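A minimal sketch of one such computation: finding the order of a permutation group by closing the set of generators under composition (serious systems use far more efficient algorithms such as Schreier-Sims):

```python
# Order of the permutation group generated by `generators`, where each
# permutation is a tuple of images of 0..n-1, via breadth-first closure.
def group_order(generators):
    identity = tuple(range(len(generators[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for h in generators:
                prod = tuple(g[h[i]] for i in range(len(h)))  # compose g∘h
                if prod not in seen:
                    seen.add(prod)
                    new.append(prod)
        frontier = new
    return len(seen)

# S3 is generated by a transposition and a 3-cycle: order 6.
print(group_order([(1, 0, 2), (1, 2, 0)]))
```

This brute-force closure works only for small groups, which is precisely why the field develops cleverer data structures for the large ones.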

Computational geophysics

Computational geophysics entails rapid numerical computations that support the analysis of geophysical data and observations. High-performance computing is involved, due to the size and complexity of the geophysical data to be processed. The main computing requirements are 3D and 4D imaging of the sub-surface earth, modeling and migration of complex media, tomography, and inverse problems. Continue reading “Computational geophysics”

Computational geometry

Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise from the study of computational geometric algorithms, and such problems are also considered to be part of computational geometry. While modern computational geometry is a recent development, it is one of the oldest fields of computing, with a history stretching back to antiquity. Continue reading “Computational geometry”
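One of the field’s canonical problems is the convex hull; a compact sketch using Andrew’s monotone chain algorithm (O(n log n) after sorting):

```python
# Convex hull via Andrew's monotone chain: build the lower and upper
# hulls separately over the points sorted by x (then y).
def cross(o, a, b):
    # z-component of the cross product (OA x OB); >0 means a left turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # counter-clockwise hull

pts = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(convex_hull(pts))  # the interior point (1, 1) is discarded
```

The cross-product orientation test at the core of this algorithm is the basic primitive of most of computational geometry.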

Computational genomics

Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, [1] including both DNA and RNA sequences as well as other “post-genomic” data (i.e., experimental data obtained with technologies that require the genome sequence, such as DNA microarrays). These fields are also often referred to as computational and statistical genetics/genomics. As such, computational genomics may be regarded as a subset of bioinformatics and computational biology, but with a focus on using whole genomes (rather than individual genes) to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means to biological discovery. [2] Continue reading “Computational genomics”

Computational economics

Computational economics is a research discipline at the interface of computer science, economics, and management science.[1] This subject encompasses computational modeling of economic systems, whether agent-based,[2] general-equilibrium,[3] macroeconomic,[4] or rational-expectations,[5] computational econometrics and statistics,[6] computational finance, computational tools for the design of automated internet markets, programming tools specifically designed for computational economics, and pedagogical tools for the teaching of computational economics. Some of these areas are unique to computational economics, while others extend traditional areas of economics by solving problems that are difficult to study without the use of computers and associated numerical methods.[7] Continue reading “Computational economics”

Computational complexity theory

Computational complexity theory is a branch of the theory of computation in theoretical computer science that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. A computational problem is understood to be a task which is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm. Continue reading “Computational complexity theory”

Computational cognition

Computational cognition (sometimes referred to as computational cognitive science) is the study of the computational basis of learning and inference by means of mathematical modeling, computer simulation, and behavioral experiments. In psychology, it is an approach which develops computational models based on experimental results. It seeks to understand the basis of the human method of processing information. Early on, computational cognitive scientists sought to bring back and create a scientific form of Brentano’s psychology. [1] Continue reading “Computational cognition”

Computational chemistry

Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into efficient computer programs, to calculate the structures and properties of molecules and solids. It is necessary because, apart from relatively recent results concerning the hydrogen molecular ion (the dihydrogen cation; see references therein for more details), the quantum many-body problem cannot be solved analytically, much less in closed form. While computational results normally complement the information obtained by chemical experiments, they can in some cases predict hitherto unobserved chemical phenomena. Computational chemistry is widely used in the design of new drugs and materials. Continue reading “Computational chemistry”

Computational biology

Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. [1] The field is broadly defined and includes foundations in computer science, applied mathematics, animation, statistics, biochemistry, chemistry, biophysics, molecular biology, genetics, genomics, ecology, evolution, anatomy, neuroscience, and visualization. [2] Continue reading “Computational biology”

Computational auditory scene analysis

Computational auditory scene analysis (CASA) is the study of auditory scene analysis by computational means. [1] In essence, CASA systems are “machine listening” systems that aim to separate mixtures of sound sources in the same way that human listeners do. CASA differs from the field of blind signal separation in that it is (at least to some extent) based on the mechanisms of the human auditory system, and thus uses no more than two microphone recordings of an acoustic environment. It is related to the cocktail party problem.

Computational astrophysics

Computational astrophysics refers to the methods and computing tools developed and used in astrophysics research. Like computational chemistry or computational physics, it is both a specific branch of theoretical astrophysics and an interdisciplinary field relying on computer science, mathematics, and wider physics. Computational astrophysics is most often studied through an applied mathematics or astrophysics program at PhD level. Continue reading “Computational astrophysics”

Computational archeology

Computational archeology describes computer-based analytical methods for the study of long-term human behavior and behavioral evolution. As with other sub-disciplines that have prefixed ‘computational’ to their name (e.g., computational biology, computational physics and computational sociology), the term is reserved for (largely mathematical) methods that could not realistically be performed without the aid of a computer. Continue reading “Computational archeology”

Community informatics

Community informatics (CI) is an interdisciplinary field concerned with using information and communication technology (ICT) to empower members of communities and support their social, cultural, and economic development. [1] [2] Community informatics may contribute to enhancing democracy, supporting the development of social capital, and building well connected communities; moreover, such actions may foster positive social change. [2] In community informatics, there are several considerations: the social contexts, shared values, and distinct processes adopted by members of a community, and the social and technical systems involved. [2] It is formally located as an academic discipline within a variety of academic faculties including information science, information systems, computer science, planning, development studies, and library science among others, and draws on insights into community development from a range of background disciplines. It is an interdisciplinary approach interested in using ICTs for different forms of community action, as distinct from the study of ICT effects. [1] [3] Continue reading “Community informatics”


Cheminformatics

Cheminformatics (also known as chemoinformatics and chemical informatics) is the use of computational and informational techniques applied to a range of problems in the field of chemistry. These in silico techniques are used, for example, in pharmaceutical companies in the process of drug discovery. The methods can also be used in chemical and allied industries in various other forms. Continue reading “Cheminformatics”

Biodiversity informatics

Biodiversity informatics is the application of informatics techniques to biodiversity information for improved management, presentation, discovery, exploration and analysis. It typically builds on a foundation of taxonomic, biogeographic, or ecological information stored in digital form, which, with the application of modern computer techniques, can yield new ways to view and analyze existing information, as well as predictive models for information that does not yet exist (see niche modeling). Biodiversity informatics is a relatively young discipline (the term was coined in or around 1992) but has hundreds of practitioners worldwide, including the numerous individuals involved with the design and building of taxonomic databases. The term “biodiversity informatics” is generally used in the broad sense, whereas the term “bioinformatics” is often used synonymously with the computerized handling of data in the specialized area of molecular biology.

Agent-based computational economics

Agent-based computational economics (ACE) is the area of computational economics that studies economic processes, including whole economies, as dynamic systems of interacting agents. As such, it falls in the paradigm of complex adaptive systems. [1] In the corresponding agent-based models, the “agents” are “computational objects modeled as interacting according to rules” over space and time, not real people. The rules are formulated to model behavior and social interactions based on incentives and information. [2] Such rules could also be the result of optimization, realized through the use of AI methods (such as Q-learning and other reinforcement learning techniques). [3] Continue reading “Agent-based computational economics”
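A toy model in the ACE spirit, a simple random wealth-exchange economy (the interaction rule and all parameters are invented for illustration, not drawn from any specific ACE study):

```python
import random

# Agents meet pairwise at random; one transfers a unit of wealth to
# the other. The "rule" is trivial, yet inequality emerges from the
# repeated interactions of the agent population.
def exchange_model(n_agents=100, rounds=10000, seed=1):
    rng = random.Random(seed)
    wealth = [10] * n_agents       # everyone starts equal
    for _ in range(rounds):
        a, b = rng.randrange(n_agents), rng.randrange(n_agents)
        if a != b and wealth[a] > 0:
            wealth[a] -= 1         # giver loses one unit
            wealth[b] += 1         # receiver gains one unit
    return wealth

w = exchange_model()
print(min(w), max(w), sum(w))  # total wealth is conserved
```

The point of such models is that aggregate patterns (here, a skewed wealth distribution) arise from local interaction rules rather than being imposed top-down.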


XGBoost

XGBoost [1] is an open-source software library that provides a gradient boosting framework for C++, Java, Python, [2] R, [3] and Julia. [4] It works on Linux, Windows, [5] and macOS. [6] From the project description, it aims to provide a “Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT) Library”. Other than running on a single machine, it also supports the distributed processing frameworks Apache Hadoop, Apache Spark, and Apache Flink. It has gained much popularity and attention recently as it was the algorithm of choice for many winning teams of machine learning competitions. [7] Continue reading “Xgboost”
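The core idea of the gradient boosting framework that XGBoost implements can be sketched from scratch (this is not XGBoost’s code or API; for squared loss, each weak learner is fit to the residuals of the current ensemble, and the toy data are invented):

```python
# Gradient boosting with depth-1 regression trees ("stumps"): each
# round fits a stump to the residuals and adds it with a learning rate.
def fit_stump(xs, residuals):
    best = None
    for split in xs:  # candidate thresholds taken from the data
        left = [r for x, r in zip(xs, residuals) if x < split]
        right = [r for x, r in zip(xs, residuals) if x >= split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x < split else rmean

def boost(xs, ys, n_rounds=50, lr=0.3):
    stumps, pred = [], [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, pred)]  # loss gradient
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # a step function to learn
model = boost(xs, ys)
print(round(model(0.5), 2), round(model(4.5), 2))
```

XGBoost adds regularization, second-order gradient information, sparsity handling, and distributed training on top of this basic additive scheme.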

Virtuoso Universal Server

Virtuoso Universal Server is a middleware and database engine that combines the functionality of a traditional relational database management system (RDBMS), object-relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server in a single system. Virtuoso is a “universal server”: it allows a single multithreaded server process to implement multiple protocols. The open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software, with Kingsley Uyi Idehen and Orri Erling as the chief software architects. Continue reading “Virtuoso Universal Server”

Apache Spark

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley’s AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Continue reading “Apache Spark”


SAP HANA

SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. [1] [2] Its primary function as a database server is to store and retrieve data as requested by the applications. In addition, it performs advanced analytics (predictive analytics, spatial data processing, text analytics, text search, streaming analytics, graph data processing) and includes ETL capabilities as well as an application server. Continue reading “SAP HANA”


MonetDB is an open source column-oriented database management system developed at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. It was designed to provide high performance on complex queries against large databases, such as combining tables with millions of rows. MonetDB has been applied in high-performance applications for online analytical processing, data mining, geographic information system (GIS), [1] Resource Description Framework (RDF), [2] text retrieval and sequence alignment processing. [3] Continue reading “MonetDB”
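The benefit of column orientation for analytical queries can be sketched in a few lines of Python (an illustration only; MonetDB's actual storage and query execution are far more sophisticated): each attribute is stored as its own contiguous array, so a query touches only the columns it needs instead of scanning whole rows.

```python
# Row layout vs. column layout for the same tiny table.

row_store = [  # row-oriented: one record per entry
    {"id": 1, "city": "Amsterdam", "sales": 120},
    {"id": 2, "city": "Utrecht", "sales": 80},
    {"id": 3, "city": "Amsterdam", "sales": 200},
]

col_store = {  # column-oriented: one array per attribute, aligned by position
    "id": [1, 2, 3],
    "city": ["Amsterdam", "Utrecht", "Amsterdam"],
    "sales": [120, 80, 200],
}

def total_sales_by_city(cols, city):
    """SELECT SUM(sales) WHERE city = ? -- scans only two of three columns."""
    return sum(s for c, s in zip(cols["city"], cols["sales"]) if c == city)

print(total_sales_by_city(col_store, "Amsterdam"))  # 320
```

With millions of rows, skipping unneeded columns (and compressing the homogeneous arrays) is what gives column stores their edge on analytical workloads.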

Predix (software)

Predix is General Electric’s software platform for the collection and analysis of data from industrial machines. [1] General Electric plans to support the growing industrial Internet of things with cloud servers and an app store . [2] GE is a member of the Industrial Internet Consortium, which works with the development and use of industrial internet technologies. [3] Continue reading “Predix (software)”

Draft: MindSphere

MindSphere is an open cloud platform or “IoT operating system” [1] developed by Siemens for applications in the context of the Internet of Things ( IoT ). [2] MindSphere stores operational data and makes it accessible through digital applications (“MindApps”) to enable industrial customers to make decisions based on valuable factual information. [3] The system is used in such applications as automated production and vehicle fleet management. [2] [4] Continue reading “Draft: MindSphere”

Apache Hadoop

Apache Hadoop (/həˈduːp/) is an open source software framework used for distributed storage and processing of datasets of big data using the MapReduce programming model. It consists of computer clusters built from commodity hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be handled by the framework. [2] Continue reading “Apache Hadoop”
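The MapReduce programming model that Hadoop implements can be shown with the classic word-count example, here as a single-process Python sketch: a map phase emits (word, 1) pairs, a shuffle groups pairs by key, and a reduce phase sums each group. Hadoop runs these same phases distributed across a cluster.

```python
# Single-process illustration of MapReduce word count.
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word occurrence."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values (here, by summing)."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"], counts["fox"])  # 3 2
```

Because map and reduce are pure functions over independent chunks, the framework can rerun any failed task on another node, which is how Hadoop turns its "failures are normal" assumption into a design principle.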

Apache Cassandra

Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, [1] with asynchronous masterless replication allowing low-latency operations for all clients. Continue reading “Apache Cassandra”
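How a masterless store places data without a single point of failure can be sketched with consistent hashing (a simplification: real Cassandra uses virtual nodes and configurable partitioners): each key hashes to a position on a ring of nodes, and the next N nodes around the ring hold its replicas, so any node can serve requests.

```python
# Consistent-hashing sketch: keys map to a ring, replicas are the next n nodes.
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]
RING_SIZE = 2 ** 32

def position(name):
    """Map a string deterministically to a position on the hash ring."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16) % RING_SIZE

ring = sorted((position(n), n) for n in NODES)

def replicas(key, n=3):
    """Return the n nodes responsible for a key, walking clockwise on the ring."""
    p = position(key)
    start = next((i for i, (pos, _) in enumerate(ring) if pos >= p), 0)
    return [ring[(start + i) % len(ring)][1] for i in range(n)]

owners = replicas("user:42")
print(len(owners), len(set(owners)))  # 3 distinct replica nodes
```

Because placement is a pure function of the key, every node can compute where any piece of data lives, so reads and writes need no central coordinator.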

Apache SystemML

Apache SystemML is a flexible machine learning system that automatically scales to Spark and Hadoop clusters. SystemML’s distinguishing characteristics are:

    1. Algorithm customizability via R-like and Python-like languages.
    2. Multiple execution modes, including Standalone, Spark Batch, Spark MLContext, Hadoop Batch, and JMLC.
    3. Automatic optimization based on data and cluster characteristics to ensure both efficiency and scalability.

Continue reading “Apache SystemML”

Apache Mahout

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms, focused primarily in the areas of collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. [2] [3] Mahout also provides Java libraries for common math operations and Java primitive collections. Mahout is a work in progress; the number of implemented algorithms has grown quickly, [4] but various algorithms are still missing. Continue reading “Apache Mahout”

Apache Beam

Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream (continuous) processing. [1] Beam pipelines are defined using one of the provided SDKs and executed by one of Beam’s supported runners (distributed processing back-ends), including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. [2] Continue reading “Apache Beam”
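Beam's core idea, a pipeline described independently of the engine that runs it, can be mimicked with a toy class (the names `Pipeline`, `par_do` and `group_by_sum` are simplified stand-ins invented here, not Beam's real SDK): transforms are chained onto a pipeline object, and a runner executes the chain.

```python
# Toy pipeline model: a recorded chain of transforms, executed by run().

class Pipeline:
    def __init__(self, source):
        self.source = source
        self.transforms = []

    def apply(self, transform):
        self.transforms.append(transform)
        return self

    def run(self):
        """A trivial 'runner': apply each transform in order, locally."""
        data = self.source
        for t in self.transforms:
            data = t(data)
        return data

def par_do(fn):
    """Element-wise transform, loosely analogous to Beam's ParDo."""
    return lambda data: [fn(x) for x in data]

def group_by_sum(data):
    """Group (key, value) pairs and sum the values per key."""
    out = {}
    for k, v in data:
        out[k] = out.get(k, 0) + v
    return out

events = [("click", 1), ("view", 1), ("click", 1)]
result = Pipeline(events).apply(par_do(lambda kv: kv)).apply(group_by_sum).run()
print(result)  # {'click': 2, 'view': 1}
```

In real Beam the same pipeline definition can be handed to different runners (Flink, Spark, Dataflow), which is the portability the "unified model" refers to.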

Smart, connected products

Smart, connected products are products, assets and other things embedded with processors, sensors, software and connectivity that allow data to be exchanged between the product and its environment, manufacturer, operator/user, and other products and systems. Connectivity also enables some capabilities of the product to exist outside the physical device, in what is known as the product cloud. The data collected from these products can be analyzed to inform decision-making, enable operational efficiencies and continuously improve the performance of the product. Continue reading “Smart, connected products”

Flutura Decision Sciences and Analytics

Flutura Decision Sciences and Analytics is an industrial Internet of things (IoT) company that focuses on machine-to-machine and big data analytics, serving customers from the manufacturing, energy and engineering industries. Its main offices are located in Palo Alto, California, and its development center is in Bengaluru, India. Continue reading “Flutura Decision Sciences and Analytics”


Carriots is an application hosting and development platform (Platform as a Service) specially designed for projects related to the Internet of Things (IoT) and Machine to Machine (M2M). It enables collecting data from objects (the “things” part), storing it, building applications with few lines of code, and integrating with IT systems (the “internet” part). Carriots provides a development environment, APIs and hosting for IoT project development. Continue reading “Carriots”

Machine to machine

Machine to machine refers to direct communication between devices using any communications channel, including wired and wireless. [1] [2] Machine to machine communication can include industrial instrumentation, enabling a sensor or meter to communicate the data it records (such as temperature, inventory level, etc.) to application software that can use it (for example, adjusting an industrial process based on temperature or placing orders to replenish inventory). [3] Such communication was originally accomplished by having a remote network of machines relay information back to a central hub for analysis, which would then be rerouted into a system like a personal computer. [4] Continue reading “Machine to machine”


VoloMetrix, Inc. is an American subsidiary of Microsoft based in Seattle, Washington. VoloMetrix sells people analytics software that combines data from collaboration platforms to create data visualizations and dashboards. At the end of April 2013, the company raised $3.3M in series A funding from Shasta Ventures. [2] In October 2014, VoloMetrix announced a round of funding with Shasta Ventures and Split Rock Partners that raised $12M. [3] In September 2015, Microsoft announced that it had acquired the company, but did not disclose the amount. The acquisition was made to improve existing Microsoft offerings such as Microsoft Office 365 and Microsoft Delve. [4] Continue reading “VoloMetrix”


Headquartered in Hod HaSharon, Israel, with offices in New York and Singapore, ThetaRay is a cyber security and big data analytics company. The company provides a platform for the detection of unknown threats and risks to protect critical infrastructure [1] and financial services. The platform is also used to uncover unknown opportunities based on big data. [2] The company uses patented mathematical algorithms developed by the company founders. [3] Continue reading “ThetaRay”


Teradata Corporation is a provider of database products and services. The company was formed in 1979 in Brentwood, California, as a collaboration between researchers at Caltech and Citibank’s advanced technology group. [2] The company was acquired by NCR Corporation in 1991, and subsequently spun off as an independent public company on October 1, 2007. Continue reading “Teradata”


Talend ( Pronunciation: TAL-end ) is a software integration vendor. The company provides big data , cloud storage , data integration , data management , master data management , data quality , data preparation and enterprise application integration software and services. [1] The company is headquartered in Redwood City, California . [2] Continue reading “Talend”

Sumo Logic

Sumo Logic is a cloud-based log management and analytics service that leverages machine-generated data to deliver real-time IT insights. [1] Headquartered in Redwood City, California, Sumo Logic was founded in April 2010 by ArcSight veterans Kumar Saurabh and Christian Beedgen, and is backed by Accel Partners, DFJ Growth, Greylock Partners, Institutional Venture Partners, Sequoia Capital, Sutter Hill Ventures and angel investor Shlomo Kramer. [2] After remaining in stealth mode for two years, it unveiled its cloud-based log management platform with Series B funding of $15 million in January 2012. [1] The round of Series E funding announced in June 2015 brought the company’s total venture capital backing to $160.5 million. [3] On June 27 the company closed its Series F round for $75 million and is on a path to IPO. [4] As of June 2017, the company has collected VC funding totaling $230 million. Continue reading “Sumo Logic”


Sojern is a provider of data-driven traveler marketing that uses programmatic buying and machine learning technology. [1] [2] Sojern partners with travel companies such as airlines and online travel agencies (OTAs) to collect anonymized (non-personally identifiable) traveler data from their sites. [2] [3] The company uses this data to target travelers and deliver advertising across a number of media channels. [1] [3] Sojern is currently headquartered in San Francisco, with key offices in New York, Omaha, Dubai, Singapore, London and Dublin. [1] [4] Continue reading “Sojern”

Semantic Research

Semantic Research, Inc. is a privately held software company headquartered in San Diego, California with flagship offices in Washington, DC and Tampa, FL . Semantic Research (not to be confused with Symantec ), is a California C-corporation that offers patented, graph-based knowledge discovery, analysis and visualization software technology. [1] [2] Its most popular product is a link analysis software application called SEMANTICA Pro. Continue reading “Semantic Research”


SalesforceIQ (formerly RelateIQ), a subsidiary of Salesforce.com, is an American enterprise software company based in Palo Alto, California. The company’s software is a relationship intelligence platform that combines data from email systems, smartphone calls, and other sources to augment or replace standard relationship management tools or database solutions. It scans “about 10,000 emails, calendar entries, and other data points per minute at first run”. [1] Continue reading “SalesforceIQ”

Rocket U2

Rocket U2 is a suite of database management (DBMS) and supporting software now owned by Rocket Software. It includes two MultiValue database platforms: UniData and UniVerse. [1] Both products are operating environments which currently run on Unix, Linux and Windows operating systems. [2] [3] They are both derivatives of the Pick operating system. [4] The family also includes developer and web-enabling technologies including SystemBuilder/SB+, SB/XA, U2 Web Development Environment (WebDE), UniObjects and wIntegrate. [1] Continue reading “Rocket U2”

Premise (company)

Premise is an American data company that tracks alternative economic indicators, such as local produce prices, and supplies insights on consumption and inflation to governments and financial institutions. [1] [2] [3] [4] [5] Co-founders David Soloff and Joe Reisinger previously came from MetaMarkets, an online advertising analytics company co-founded by Soloff. [6] Continue reading “Premise (company)”

Palantir Technologies

Palantir Technologies is a private American software and services company which specializes in big data analysis. Headquartered in Palo Alto, California, the company is known for two main products: Palantir Gotham and Palantir Metropolis. Palantir Gotham is used by counter-terrorism analysts at offices in the United States Intelligence Community (USIC) and United States Department of Defense, fraud investigators at the Recovery Accountability and Transparency Board, and cyber analysts at Information Warfare Monitor, while Palantir Metropolis is used by hedge funds, banks, and financial services firms. [3] [4] Continue reading “Palantir Technologies”


Medio is a business-to-business mobile analytics provider based in Seattle, WA. The company processes pre-existing data [2] to provide historic and predictive analytics. Medio is built on a cloud-based [3] Hadoop platform and is designed to interpret big data for the mobile enterprise. Medio has had various partners including IBM, Rovio, [4] Verizon, T-Mobile, [5] ABC, and Disney. [6] Continue reading “Medio”

User: Maxhercask / sandbox

Cask Data , dba ‘Cask’, is a privately held information technology company, established in 2011, with its headquarters located in Palo Alto, California . It provides software and services that enable broad, data-intensive enterprises – such as Thomson Reuters [1] – and many other diverse clients to accelerate their ability to extract value from their big data investments. Continue reading “User: Maxhercask / sandbox”


MarkLogic Corporation is an American software business that develops and provides an enterprise NoSQL database, also named MarkLogic . The company was founded in 2001 and is based in San Carlos , California . MarkLogic is privately held with over 500 employees and has offices throughout the United States , Europe , Asia , and Australia . Continue reading “MarkLogic”


MapR Technologies, Inc. is an enterprise software company headquartered in Santa Clara, California. MapR provides access to a wide variety of data sources from a single cluster, including big data workloads such as Apache Hadoop and Apache Spark, a distributed file system, a multi-model database management system, and event streaming. Combining real-time analytics with operational applications, its technology runs on both commodity hardware and public cloud computing services. Continue reading “MapR”

Groundhog Technologies

Groundhog Technologies is a privately held company founded in 2001 and headquartered in Cambridge, Massachusetts, USA. A spin-off of the MIT Media Lab, [1] [2] it was a semi-finalist in MIT’s $50k Entrepreneurship Competition in 2000 and was incorporated the following year. [3] [4] The company received its first round of financing in November 2002 from major Japanese corporations and their venture capital arms, including Marubeni, Yasuda Enterprise Development and Japan Asia Investment Co. [5] [6] It received a second round of financing in 2004 and has since become self-sustaining. [7] Continue reading “Groundhog Technologies”


Greenplum was a big data analytics company headquartered in San Mateo, California. Greenplum was acquired by EMC Corporation in July 2010. [1] Starting in 2012, its database management system software became known as the Pivotal Greenplum Database, sold through Pivotal Software, and is currently developed by Pivotal and the Greenplum Database open source community. Continue reading “Greenplum”


Databricks is a company founded by the creators of Apache Spark, [1] which aims to help customers with cloud-based big data processing using Spark. [2] [3] Databricks grew out of the AMPLab project at UC Berkeley, which was involved in making Apache Spark, a distributed computing framework built atop Scala. Databricks develops a web-based platform for working with Spark that provides automated cluster management and IPython-style notebooks. In addition to building the Databricks platform, the company co-organizes massive open online courses about Spark [4] and runs the largest conference about Spark, Spark Summit. Continue reading “Databricks”


cVidya Networks is a provider of big data analytics for communications and digital service providers. cVidya’s market includes business protection and business growth, covering revenue assurance, fraud management, marketing analytics and data monetization. The company has 300 employees in 18 countries and has over 150 customers. cVidya’s investors include Battery Ventures, Carmel Ventures, Hyperion, StageOne, Saints Capital and Plenus. Continue reading “cVidya”

Cambridge Technology Enterprises

Cambridge Technology Enterprises is a global IT services company. The company is predominantly US-focused and serves companies like Schneider Electric, Hills Pet and Iron Mountain. Cambridge Technology Enterprises helps organizations by leveraging AI, big data, cloud and machine learning. The company was also recently assessed at CMMI v1.3 Level 5, with ISO 9001:2008 and ISO 27001:2005 certifications. The company has a workforce of 350, with offices in Atlanta, Kansas, Louisville, San Francisco, Boston and Pittsburgh, and development centers in Hyderabad, Chennai and Bangalore in India. Continue reading “Cambridge Technology Enterprises”

Big Data Scoring

Big Data Scoring is a cloud-based service that lets consumer credit lenders improve loan quality and acceptance rates through the use of big data. The company was founded in 2013 and has offices in the UK, Finland, Chile, Indonesia and Poland. The company’s services are aimed at all lenders: banks, payday lenders, peer-to-peer lending platforms, microfinance providers and leasing companies. [1] Continue reading “Big Data Scoring”


Axtria is a New Jersey- based technology company that develops and markets cloud-based data analytics services and solutions for business. [4] The company’s software is embedded into commercial processes to analyze data and provide insights. [5] The company is headquartered in Berkeley Heights, New Jersey , and has additional locations in California , Arizona, Georgia , Virginia , and Ireland and development centers in Boston , Chicago and Gurgaon, India . [6] [7] [3] Continue reading “Axtria”

Alpine Data Labs

Alpine Data Labs is an advanced analytics interface working with Apache Hadoop and big data. [1] [2] [3] [4] [5] [6] It provides a collaborative, visual environment to create and deploy analytics workflows and predictive models. [7] [8] This aims to make analytics more accessible to business-analyst-level staff, sales and other departments that use the data, rather than requiring a “data engineer” or “data scientist” who understands languages like MapReduce or Pig. [2] [9] [10] Continue reading “Alpine Data Labs”


Lucidworks is a San Francisco, California-based enterprise search technology company offering an application development platform, commercial support, consulting, training and value-added software for the open source Apache Lucene and Apache Solr projects. Lucidworks is a private company founded in 2007 as Lucid Imagination and publicly launched on January 26, 2009. The company was renamed Lucidworks on August 8, 2012. [1] The company received Series A funding from Granite Ventures and Walden International in September 2008; In-Q-Tel is a strategic investor. In August 2014, Lucidworks closed an $8 million Series C round with Shasta Ventures, Granite Ventures and Walden International participating. [2] In November 2015, Lucidworks closed a $21 million Series D round with Allegis Capital and existing investors Shasta Ventures and Granite Ventures participating. [3] Continue reading “Lucidworks”

Venice Time Machine

The Venice Time Machine is a large international project launched by the Swiss Federal Institute of Technology in Lausanne (EPFL) and the Ca’ Foscari University of Venice in 2012 that aims to build a multidimensional collaborative model of Venice by creating an open digital archive of the city’s cultural heritage covering more than 1,000 years of evolution. [1] The project aims to trace the circulation of news, money, commercial goods, migration, and artistic and architectural patterns, among others, to create a Big Data of the Past. Its fulfillment would represent the largest database ever created on Venetian documents. [2] The project is an example of the new area of scholarly activity that has emerged in the Digital Age: Digital Humanities. Continue reading “Venice Time Machine”

Social media mining

Social media mining is the process of representing, analyzing, and extracting actionable patterns and trends from raw social media data. The term “mining” is an analogy to the resource extraction process of mining for rare minerals. Resource extraction mining requires mining companies to sift through vast quantities of raw ore; likewise, social media “mining” requires human data analysts and automated software programs to sift through massive amounts of raw social media data (e.g., social media usage, online behavior, sharing of content, connections between individuals, online buying behavior, etc.) in order to discern patterns and trends. These patterns and trends can then inform strategies (or, for companies, new products, processes and services). Continue reading “Social media mining”

Social Credit System

The Social Credit System is a proposed Chinese government initiative [1] [2] [3] for developing a national reputation system. It has been reported to assign a “social credit” rating to each citizen based on government data about their economic and social status. [4] [3] [5] [6] [7] It works as a mass monitoring tool and uses big data analysis technology. [8] In addition, it is also meant to rate businesses operating on the Chinese market. [9] Continue reading “Social Credit System”

Session (web analytics)

In web analytics, a session, or visit, is a unit of measurement of a user’s actions taken within a period of time or with regard to completion of a task. Sessions are also used in operational analytics and provision of user-specific recommendations. There are two primary methods used to define a session: time-oriented approaches based on continuity in user activity, and navigation-based approaches based on continuity in a chain of requested pages. Continue reading “Session (web analytics)”
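The time-oriented approach can be sketched in a few lines: hits from the same user belong to one session as long as the gap between consecutive hits stays under an inactivity timeout. The 30-minute timeout below is a widely used default, chosen here as an assumption rather than a standard mandated by any particular analytics tool.

```python
# Time-oriented sessionization: split hits on inactivity gaps.
TIMEOUT = 30 * 60  # inactivity threshold, in seconds

def sessionize(timestamps, timeout=TIMEOUT):
    """Split a list of hit timestamps (seconds) into sessions."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)   # gap small enough: same session
        else:
            sessions.append([t])     # inactivity gap: start a new session
    return sessions

hits = [0, 60, 300, 5000, 5100]      # the 4700 s gap exceeds 1800 s
print(len(sessionize(hits)))         # 2 sessions
```

A navigation-based approach would instead inspect referrer chains, ending a session when a requested page does not follow from the previous one.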

Security visualization

Security visualization is a subject that broadly covers the aspects of big data, visualization, human perception and security. Each day, we collect more and more data in the form of data files. Big data mining techniques like MapReduce help narrow the search for meaning in data. Data visualization is a data analytics technique used to engage the human brain while finding patterns in data. Continue reading “Security visualization”

Data literacy

Data literacy is the ability to read, create and communicate data, and has been formally described in varying ways. Discussion of the skills inherent to data literacy and of feasible instructional methods has emerged as data collection becomes routinized and talk of data analysis and big data has become commonplace in the news, business, [1] government [2] and society in countries across the world. [3] Continue reading “Data literacy”

Lambda architecture

Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream-processing methods. This approach to architecture attempts to balance latency, throughput, and fault-tolerance by using a combination of batch processing and real-time stream processing. The two view outputs may be joined before presentation. The rise of lambda architecture is correlated with the growth of big data, real-time analytics, and the drive to mitigate the latencies of map-reduce. [1] Continue reading “Lambda architecture”
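The two paths and the merge at query time can be sketched minimally (a toy with word counts standing in for arbitrary aggregations): a batch layer periodically recomputes an accurate view over all historical data, a speed layer maintains an incremental view over recent events, and a serving step joins the two views.

```python
# Minimal lambda-architecture sketch: batch view + speed view, merged on query.
from collections import Counter

historical_events = ["a", "b", "a"]      # already processed by the batch layer
recent_events = ["a", "c"]               # arrived since the last batch run

batch_view = Counter(historical_events)  # comprehensive but stale
speed_view = Counter(recent_events)      # fresh but covers only recent data

def query(key):
    """Serving layer: join both view outputs before presentation."""
    return batch_view[key] + speed_view[key]

print(query("a"), query("c"))  # 3 1
```

When the next batch run completes, the recent events are absorbed into the batch view and the speed view is reset, which is how the architecture keeps results both accurate and low-latency.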

IT operations analytics

In the fields of information technology (IT) and systems management, IT operations analytics (ITOA) is an approach or method to retrieve, analyze, and report data for IT operations. ITOA may apply big data analytics to large datasets to produce business insights. [1] [2] In 2014, Gartner predicted its use would increase revenue or reduce costs. [3] By 2017, it predicted, 15% of enterprises will use IT operations analytics technologies. [2] Continue reading “IT operations analytics”

Intelligence engine

An intelligence engine is a type of enterprise information management that combines business rule management, predictive analytics, and prescriptive analytics to form a unified information-access platform that provides real-time intelligence through search technologies, dashboards and/or existing business infrastructure. Intelligence engines are process- and/or business-problem specific, resulting in industry- and/or function-specific marketing. They can be differentiated from enterprise resource planning (ERP) systems by their decision management functionality. Continue reading “Intelligence engine”

Industrial big data

Industrial big data refers to the large amount of diversified time series data generated at high speed by industrial equipment, [1] known as the Internet of things. [2] The term emerged in 2012 along with the concept of “Industry 4.0”, and differs from “big data”, popular in information technology marketing, in that data created by industrial equipment might hold more potential business value. [3] Industrial big data takes advantage of industrial Internet technology. It uses raw data to support management decision making, so as to reduce costs in maintenance and improve customer service. [2] Continue reading “Industrial big data”

Head / tail Breaks

Head/tail breaks is a clustering algorithm for data with a heavy-tailed distribution, such as power laws and lognormal distributions. A heavy-tailed distribution reflects the scaling pattern of far more small things than large ones. [1] The classification divides the values around the arithmetic mean into a large part (called the head) and a small part (called the tail), and then recursively repeats the division for the head until the notion of far more small things than large ones no longer holds. Head/tail breaks is useful not just for classification, but also for visualization of big data by keeping the head, since the head is self-similar to the whole. Head/tail breaks can be applied not only to vector data such as points, lines and polygons, but also to raster data like the digital elevation model (DEM). Continue reading “Head / tail Breaks”
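The procedure described above fits in a short function. This sketch assumes a 40% head-size threshold as the stopping rule for "far more small things than large ones", a commonly cited choice but an assumption here, not part of every formulation of the algorithm.

```python
# Head/tail breaks: split around the mean, recurse on the head while it
# remains a small minority of the values.

def head_tail_breaks(values, head_ratio=0.4):
    """Return the list of class breaks (the successive means)."""
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        if not head or len(head) / len(data) >= head_ratio:
            break                 # no longer far more small things than large
        breaks.append(mean)
        data = head               # recurse on the head only
    return breaks

# A heavy-tailed toy dataset: many small values, few large ones.
values = [1] * 20 + [2, 2, 3, 3, 5, 10, 20, 50, 300]
print(head_tail_breaks(values))   # two breaks for this dataset
```

Each break separates one "hierarchy level" of the data, so the number of breaks also indicates how many scaling levels the dataset contains.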

GIS United

GIS United (GU/GIS Utd) is a union of GIS specialists with a variety of backgrounds, such as business administration, public administration, environmental engineering, mechanical engineering, statistics, urban engineering, architecture, historical studies, literature and art. It is a consulting firm specializing in geo-spatial big data analysis, headquartered in Seogyo, Mapo, Seoul, South Korea. Continue reading “GIS United”



Dataveillance is the practice of monitoring and collecting metadata. [1] The word is a portmanteau of data and surveillance. [2] Dataveillance is concerned with the continuous monitoring of users’ communications and actions across various platforms. [3] For instance, dataveillance refers to the monitoring of data resulting from credit card transactions, GPS coordinates, emails, social networks, etc. Using digital media often leaves traces of data and creates a digital footprint of our activity. [4] This type of surveillance is often not apparent to those being monitored. [5] Unlike sousveillance, where individuals willingly monitor their own activity, dataveillance is more discreet and unknown. Dataveillance may involve the monitoring of groups of individuals. There exist three types of dataveillance: personal dataveillance, mass dataveillance, and facilitative mechanisms. [3] Continue reading “dataveillance”


DataOps is an automated, process-oriented methodology used by big data teams to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured into a new and independent approach to data analytics. [1] DataOps applies to the entire data lifecycle, [2] from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations. [3] From a process and methodology perspective, DataOps applies Agile software development, DevOps [3] and the statistical process control used in lean manufacturing to data analytics. [4] Continue reading “DataOps”

Data-centric security

Data-centric security is an approach to security that emphasizes the security of the data itself rather than the security of networks, servers, or applications. Data-centric security is evolving rapidly as companies increasingly rely on digital information to run their business and big data projects become mainstream. [1] [2] [3] Data-centric security also enables organizations to overcome the disconnect between IT security technology and the objectives of business strategy by relating security services directly to the data they implicitly protect; a relationship that is often obscured by the presentation of security as an end in itself. [4] Continue reading “Data-centric security”

Data Shadows

Data shadows are the information that an individual unintentionally leaves behind, which is then collected by organizations and servers. [1] [2] [full citation needed] This information is a vastly detailed record of an individual’s everyday life, which includes the individual’s thoughts and interests, their communication and work information, the information about the organizations they interact with, and so forth. [1] The concept of the data shadow is closely linked with data footprints and dataveillance. Data footprints and shadows produce information that is dispersed across dozens of organizations and servers. [3] [full citation needed] Continue reading “Data Shadows”

Cambridge Analytica

Cambridge Analytica (CA) is a privately held company that combines data mining and data analysis with strategic communication for the electoral process. It was created in 2013 as an offshoot of its British parent company SCL Group to participate in American politics. [2] In 2014, CA was involved in 44 US political races. [3] The company is owned mostly by the family of Robert Mercer, an American hedge-fund manager who supports many politically conservative causes. [2] [4] The firm maintains offices in New York City, Washington, DC, and London. [5] Continue reading “Cambridge Analytica”

Burst buffer

In the high-performance computing environment, a burst buffer is a fast, intermediate storage layer between the front-end computing processes and the back-end storage systems. It emerged as a storage solution to bridge the ever-increasing performance gap between the processing speed of the compute nodes and the input/output (I/O) bandwidth of the storage systems. [1] Burst buffers are built from high-performance storage devices, such as NVRAM and SSD, and typically offer one to two orders of magnitude higher I/O bandwidth than the back-end storage systems. Continue reading “Burst buffer”

BisQue (Bioimage Analysis and Management Platform)

BisQue [1] is a free, open source web-based platform for the exchange and exploration of large, complex datasets. It is being developed at the Vision Research Lab [2] at the University of California, Santa Barbara. BisQue specifically supports large-scale, multi-dimensional and multimodal images and image analysis. Metadata is stored as arbitrarily nested and linked tag/value pairs, allowing for domain-specific data organization. Image analysis modules can be added to perform complex analysis tasks on compute clusters. Analysis results are stored in the database for further querying and processing. The data and analysis provenance is maintained for reproducibility of results. BisQue can be easily deployed in cloud computing environments or on computer clusters for scalability. BisQue has been integrated into the NSF Cyberinfrastructure project CyVerse. [3] The user interacts with BisQue via any modern web browser. Continue reading “BisQue (Bioimage Analysis and Management Platform)”

Big Data Maturity Model

Big Data Maturity Models (BDMM) are artifacts used to measure big data maturity. [1] These models help organizations to create structure around their big data capabilities and to identify where to start. [2] They provide tools that assist organizations in defining their data strategy. BDMMs also provide a methodology for measuring the state of a company’s big data capability, the effort required to complete the current stage of progress, and how to progress to the next stage. Additionally, BDMMs measure and manage the speed of both the progress and the adoption of big data programs in the organization. [1] Continue reading “Big Data Maturity Model”

Big data

Big data refers to data sets that are so voluminous and complex that traditional data-processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, and information privacy. The concept is often characterized by three dimensions: volume, variety, and velocity. Continue reading “Big data”

Business analytics

Business analytics ( BA ) refers to the skills, technologies, and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. [1] Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods. In contrast, business intelligence traditionally focuses on using a consistent set of metrics to both measure past performance and guide business planning, which is also based on data and statistical methods. [citation needed] Continue reading “Business analytics”


Analytics is the discovery, interpretation, and communication of meaningful patterns in data. Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics, computer programming, and operations research to quantify performance. Continue reading “Analytics”

Data Administration

Administrative data are a type of big data. They are collected by governments or other organizations for non-statistical reasons, to provide overviews of registration, transactions, and record keeping. [1] They constitute part of the output of administering a program. Examples include birth and death records, records regulating the crossing of people and goods over borders, pensions, and taxation. [2] These types of data are used in the supply of information. When turned into indicators, administrative data can show trends over time and reflect real-world information. The management of this information involves the Internet, software, technology, telecommunications, databases and management systems, system development methods, information systems, and so on. Managing the resources of the public sector is a complex routine. It begins with the collection of data, then goes through the hardware and software that store, manipulate, and transform the data. Public sources are then addressed, including organizational policies and procedures. [3] Continue reading “Data Administration”
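Turning raw administrative records into indicators that show trends over time can be as simple as aggregating counts by period. A minimal sketch, using invented sample birth-registration records rather than any real dataset:

```python
from collections import Counter

# Invented birth-registration records: (year, district).
records = [
    (2019, "North"), (2019, "South"), (2020, "North"),
    (2020, "North"), (2020, "South"), (2021, "South"),
]

# Indicator: registered births per year, revealing the trend over time.
births_per_year = Counter(year for year, _ in records)
trend = [births_per_year[y] for y in sorted(births_per_year)]
print(trend)
```

The same aggregation pattern applies to border crossings, pension payments, or tax filings: the administrative record is the by-product, and the indicator is derived from it.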

The Groundwork

The Groundwork is a privately held technology firm, run by Michael Slaby, which was formed in June 2014. [1] Campaign finance disclosures revealed that Hillary Clinton’s campaign was a client of the Groundwork. [2] [1] Most of the Groundwork’s employees are back-end software developers who previously worked at companies such as Netflix, DreamHost, and Google. [1] Continue reading “The Groundwork”

Data philanthropy

Data philanthropy describes a form of collaboration in which private sector companies share data for public benefit. [1] Many uses of data philanthropy are being explored, from humanitarian and corporate to human rights and academic applications. Since introducing the term in 2011, the United Nations Global Pulse has advocated for a global “data philanthropy movement”. [2] Continue reading “Data philanthropy”

The value of open data

Open data refers to the free availability and usability of data, mostly public data. The demand for it rests on the assumption that advantageous developments such as open government are supported when appropriately prepared, user-friendly information is made publicly available, allowing more transparency and cooperation. For this purpose, creators use license models that largely forgo copyright, patents, or other proprietary rights. Open data resembles numerous other “open” movements, such as open source, open content, open access, and open education, and is a prerequisite for open government.

Definition of “Open Data”

Open data are all data holdings that, in the interest of the general public, are made available to society without any restriction, for free use, redistribution, and further reuse. [1] Examples include teaching materials, spatial data, statistics, market information, scientific publications, medical research results, and radio and television broadcasts. Open data is not limited to databases of public administration; privately operated companies, universities, and broadcasters, as well as non-profit bodies, also produce relevant contributions. [1]

To designate data as open data, various licenses exist, such as CC Zero. Licenses that restrict the use of data, for example by prohibiting modification or commercial use, do not comply with the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities” and are not considered open data.

Demands of the Open Data movement

The concept of open data is not new, but the term, unlike, for example, open access, has not yet been generally defined. Open data refers specifically to information outside of text form, such as weather information, maps, genomes, or medical data. Since this material is of commercial interest, conflicts often arise. Proponents of open data argue that it constitutes common property and that free use of the data must not be hindered by restrictions.

A quotation typically cited to show the necessity of open data:

“Numerous scientists have pointed out the irony that right at the historical moment when we have the technologies to permit worldwide availability and distributed processing of scientific data, broadening collaboration and accelerating the pace and depth of discovery […] we are busy locking up that data and preventing the use of correspondingly advanced technologies on knowledge.”


John Wilbanks, executive director, Science Commons [2]

Data producers often neglect the need to define usage rights. Data are sometimes unnecessarily excluded from further free use simply because a (free) license is missing.

The Open Data movement not only calls for free access to data but also generates data itself. One example is OpenStreetMap. Proponents claim that the open data concept makes a more democratic society possible, allowing, for example, the British website TheyWorkForYou.com to track the voting records of British MPs. [3] Where the data relate to a government, one also speaks of open government. Rob McKinnon said in a presentation at re:publica that the loss of the data privilege “can lead to new power structures within a society”. [4] Another good example is the site farmsubsidy.org, which shows to whom EU agricultural subsidies, accounting for almost half of the total EU budget, are paid. German politicians in particular have long balked at making this information public.

To meet the criteria of open data, data must be made available in a structured, machine-readable form, so that they can be searched, filtered, and further processed by other applications. Data from government agencies, for example, are often provided only as PDF and therefore cannot be further processed without problems.
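The practical difference between a PDF release and a machine-readable one is that structured data can be filtered and reused directly by other applications. A short standard-library Python sketch over a hypothetical CSV release (the stations and figures are invented for illustration):

```python
import csv
import io

# A hypothetical open-data release in structured, machine-readable CSV form.
release = """station,year,mean_temp_c
Berlin,2020,10.9
Berlin,2021,9.9
Munich,2020,9.5
"""

# Parse the release and filter it, something a PDF would not allow directly.
rows = list(csv.DictReader(io.StringIO(release)))
berlin = [float(r["mean_temp_c"]) for r in rows if r["station"] == "Berlin"]
print(berlin)  # the Berlin values, ready for further processing
```

Three lines of filtering replace what would otherwise be manual transcription from a document, which is exactly the reuse that the structured, machine-readable requirement enables.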

Continue reading “The value of open data”

Posted in Web

Big data

Big data (rendered in French as mégadonnées, the officially recommended term [3], though big data is also used [4]) designates data sets that have become so large that they exceed human intuition and capacity for analysis, and even the capacities of conventional database and information-management tools. [5]

The quantitative explosion (and frequent redundancy) of digital data has forced new ways of seeing and analyzing the world. [6] New orders of magnitude concern the capture, storage, search, sharing, analysis, and visualization of data. The prospects of big data processing are enormous and in part still unsuspected; there is frequent talk of new possibilities for exploring information disseminated by the media, [7] for knowledge and evaluation, for trend analysis and forecasting (climate, environmental, socio-political, etc.), and for risk management (commercial, insurance, industrial, natural), as well as for religious, cultural, and political phenomena, [8] but also in terms of genomics or metagenomics, [9] medicine (understanding brain function, epidemiology, eco-epidemiology), meteorology and adaptation to climate change, management of complex energy networks (via smart grids or a future “energy internet”), ecology (functioning and dysfunction of ecological networks, food webs, with GBIF for example), or security and the fight against crime. [10] The multiplicity of these applications is already giving rise to a true economic ecosystem involving the biggest players in the information technology sector. [11]

Some [who?] assume that big data could help companies reduce risks and facilitate decision-making, or create differentiation through predictive analytics and a more personalized and contextualized “customer experience”. [12]

Various experts, major institutions (such as MIT [13] in the United States), administrations, [14] and specialists in the field of technologies or their uses [15] consider the big data phenomenon one of the major IT challenges of the 2010-2020 decade and have made it one of their new research and development priorities; it could notably lead to artificial intelligence based on self-learning artificial neural networks. [16]


Big data is accompanied by the development of analytical applications that process the data to give them meaning. [34] These analyses are called big analytics [35] or “data crunching”. They focus on complex quantitative data, using distributed computing methods and statistics.

In 2001, a research report of the META Group (now Gartner) [36] defined the issues inherent in data growth as three-dimensional: complex analyses meet the so-called “3V” rule (volume, velocity, and variety). [37] This model is still widely used today to describe the phenomenon. [38]

The global average annual growth rate of the big data technology and services market over the 2011-2016 period is expected to be 31.7%. This market is expected to reach $23.8 billion in 2016 (according to IDC, March 2013). Big data is also expected to represent 8% of European GDP in 2020 (AFDEL, February 2013).
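A 31.7% average annual growth rate compounds quickly, and the cited 2016 figure can be sanity-checked by applying the rate year over year. The 2011 starting value below is back-solved from the cited 2016 figure and is an illustration, not a sourced number:

```python
# Compound a hypothetical 2011 market size at the cited 31.7% annual rate.
start_2011 = 6.0          # billions of dollars; illustrative back-solved figure
rate = 0.317

value = start_2011
for year in range(2011, 2016):   # five compounding steps, 2011 -> 2016
    value *= 1 + rate

print(round(value, 1))    # close to the cited $23.8 billion for 2016
```

In other words, the projection implies a market roughly quadrupling over five years, since 1.317 raised to the fifth power is close to 4.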


This is a relative dimension: as Lev Manovich noted in 2011, [39] big data was once defined as “data sets large enough to require super-computers”, but it quickly (in the 1990s and 2000s) became possible to use standard software on desktop computers to analyze or co-analyze large data sets. [40]

The volume of stored data is growing rapidly: the digital data created worldwide grew from 1.2 zettabytes per year in 2010 to 1.8 zettabytes in 2011, [41] then to 2.8 zettabytes in 2012, and is expected to rise to 40 zettabytes in 2020. As an example, in January 2013 Twitter generated 7 terabytes of data each day and Facebook 10 terabytes. [42] In 2014, Facebook Hive generated 4,000 TB of data per day. [43]

It is the technical and scientific facilities (meteorology, etc.) that produce the most data. [citation needed] Many pharaonic projects are under way. The radio telescope “Square Kilometre Array”, for example, will produce 50 terabytes of analyzed data per day, drawn from raw data produced at a rate of 7,000 terabytes per second. [44]


The volume of big data confronts data centers with a real challenge: the variety of data. These are not traditional relational data; they are raw, semi-structured, or even unstructured (though unstructured data will have to be structured before use [45]). They are complex data coming from the web (web mining), from text (text mining), and from images (image mining). They may be public (open data, web of data), geo-demographic by block (IP addresses), or the property of consumers (360° profiles). [citation needed] This makes them difficult to use with traditional tools.

The multiplication of tools for collecting data on individuals and objects makes it possible to collect ever more data. [46] And the analyses are all the more complex as they increasingly concern the links between data of different natures.


Velocity represents the frequency with which data are generated, captured, shared, and updated. [47]

Growing data flows must be analyzed in near real time (data stream mining) to meet the needs of time-sensitive processes. [48] For example, the systems put in place by stock markets and companies must be able to process data before a new generation cycle has begun, with the risk that humans lose much of their control over the system when the main operators become “robots” capable of issuing buy or sell orders at the nanosecond scale (high-frequency trading) without having all the relevant analysis criteria for the medium and long term. Continue reading “Big data”
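The key idea of data stream mining is that the analysis keeps only a bounded window of recent items rather than the full, ever-growing history. A minimal sliding-window average over a simulated tick stream (all numbers invented):

```python
from collections import deque

def sliding_averages(stream, window_size):
    """Yield the mean of the last `window_size` items after each arrival."""
    window = deque(maxlen=window_size)   # older items fall out automatically
    for item in stream:
        window.append(item)
        yield sum(window) / len(window)

prices = [100, 102, 101, 105, 110]       # simulated tick stream
averages = list(sliding_averages(prices, window_size=3))
print(averages[-1])  # average of the last three ticks
```

Because memory stays constant no matter how long the stream runs, each result is available as soon as its tick arrives, which is what makes near-real-time processing of unbounded flows feasible.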

Posted in Web