Data science on the Google cloud platform : Implementing end-to-end real-time data pipelines : From ingest to machine learning
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP. You'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way. Employ best practices in building highly scalable data and ML pipelines on Google Cloud / Automate and schedule data ingest using Cloud Run / Create and populate a dashboard in Data Studio / Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery / Conduct interactive data exploration with BigQuery / Create a Bayesian model with Spark on Cloud Dataproc / Forecast time series and do anomaly detection with BigQuery ML / Aggregate within time windows with Dataflow / Train explainable machine learning models with Vertex AI / Operationalize ML with Vertex AI Pipelines
Data science in theory and practice : Techniques for big data analytics and complex data sets
Delivers a comprehensive treatment of the mathematical and statistical models useful for analyzing data sets arising in various disciplines, like banking, finance, health care, bioinformatics, security, education, and social services. Written in five parts, the book examines some of the most commonly used and fundamental mathematical and statistical concepts that form the basis of data science. The authors go on to analyze various data transformation techniques useful for extracting information from raw data, long memory behavior, and predictive modeling. Readers will also learn from topics like: analyses of foundational theoretical subjects, including the history of data science, matrix algebra and random vectors, and multivariate analysis; a comprehensive examination of time series forecasting, including the different components of time series and transformations to achieve stationarity; introductions to both the R and Python programming languages, including basic data types and sample manipulations for both languages; an exploration of algorithms, including how to write one and how to perform an asymptotic analysis; and a comprehensive discussion of several techniques for analyzing and predicting complex data sets.
Data science for economics and finance : Methodologies and applications
The book starts with an introduction to the use of data science technologies in economics and finance and is followed by thirteen chapters showing success stories of the application of specific data science methodologies, touching on particular topics related to novel big data sources and technologies for economic analysis (e.g. social media and news); big data models leveraging supervised/unsupervised (deep) machine learning; natural language processing to build economic and financial indicators; and forecasting and nowcasting of economic variables through time series analysis.
Data Science for Civil Engineering : A Beginner's Guide
Explains the use of data science-based techniques for modeling and providing optimal solutions to complex problems in civil engineering. It discusses civil engineering problems like air, water and land pollution, climate crisis, transportation infrastructures, traffic and travel modes, mobility services, and so forth. The book is divided into two sections: the first deals with the basics of data science and essential mathematics, while the second covers pertinent applications in structural and environmental engineering, construction management, and transportation.
Data science and data analytics : Opportunities and challenges
Introduces the concepts, tools, and algorithms of data science that exist for many useful applications / Presents challenges and opportunities in data science and data analytics that help researchers identify research gaps or problems / Identifies many areas and uses of data science in the smart era / Applies data science to agriculture, healthcare, graph mining, education, security, etc.
Data Science and Classification
This volume provides new methodological developments in data analysis and classification. A wide range of topics is covered that includes the measurement of similarity and dissimilarity, methods for classification and clustering, network and graph analyses, analysis of symbolic data, and web mining. Apart from structural and theoretical results, the book shows how to apply the proposed methods to a variety of problems, for example in medicine, microarray analysis, social network structures, and music. The combination of new methodological advances with the wide range of real applications collected in this volume is of special value for researchers when choosing the appropriate one among newly developed analytical tools for their research problems in classification and data analysis.
Data science and analytics ; 5th International conference on recent developments in science, engineering and technology, REDSET 2019, Gurugram, India, November 15–16, 2019, Revised Selected Papers, Part II
This two-volume set (CCIS 1229 and CCIS 1230) constitutes the refereed proceedings of the 5th International Conference on Recent Developments in Science, Engineering and Technology, REDSET 2019, held in Gurugram, India, in November 2019. The 74 revised full papers presented were carefully reviewed and selected from a total of 353 submissions. The papers are organized in topical sections on data centric programming; next generation computing; social and web analytics; security in data science analytics; big data analytics.
Data science and analytics ; 5th International conference on recent developments in science, engineering and technology, REDSET 2019, Gurugram, India, November 15–16, 2019, Revised Selected Papers, Part I
This two-volume set (CCIS 1229 and CCIS 1230) constitutes the refereed proceedings of the 5th International Conference on Recent Developments in Science, Engineering and Technology, REDSET 2019, held in Gurugram, India, in November 2019. The 74 revised full papers presented were carefully reviewed and selected from a total of 353 submissions. The papers are organized in topical sections on data centric programming; next generation computing; social and web analytics; security in data science analytics; big data analytics.
Data science ; 6th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2020, Taiyuan, China, September 18-21, 2020, Proceedings, Part II
This two-volume set (CCIS 1257 and 1258) constitutes the refereed proceedings of the 6th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2020, held in Taiyuan, China, in September 2020. The 98 papers presented in these two volumes were carefully reviewed and selected from 392 submissions. The papers are organized in topical sections: database, machine learning, network, graphic images, system, natural language processing, security, algorithm, application, and education.
Data Science ; 6th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2020, Taiyuan, China, September 18-21, 2020, Proceedings, Part I
This two-volume set (CCIS 1257 and 1258) constitutes the refereed proceedings of the 6th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2020, held in Taiyuan, China, in September 2020. The 98 papers presented in these two volumes were carefully reviewed and selected from 392 submissions. The papers are organized in topical sections: database, machine learning, network, graphic images, system, natural language processing, security, algorithm, application, and education.
Data Quality : Concepts, Methodologies and Techniques
Batini and Scannapieco present a comprehensive and systematic introduction to the wide set of issues related to data quality. They start with a detailed description of different data quality dimensions, like accuracy, completeness, and consistency, and their importance in different types of data, like federated data, web data, or time-dependent data, and in different data categories classified according to frequency of change, like stable, long-term, and frequently changing data. The book's extensive description of techniques and methodologies from core data quality research as well as from related fields like data mining, probability theory, statistical data analysis, and machine learning gives an excellent overview of the current state of the art.
Data mining with computational intelligence
Finding information hidden in data is as theoretically difficult as it is practically important. With the objective of discovering unknown patterns from data, the methodologies of data mining were derived. Wang and Fu present in detail the state of the art on how to utilize fuzzy neural networks, multilayer perceptron neural networks, radial basis function neural networks, genetic algorithms, and support vector machines in such applications. They focus on three main data mining tasks: data dimensionality reduction, classification, and rule extraction. The book is targeted at researchers in both academia and industry, while graduate students and developers of data mining systems will also profit from the detailed algorithmic descriptions.
Data Mining for Biomedical Applications ; PAKDD 2006 Workshop, BioDM 2006, Singapore, April 9, 2006, Proceedings
This book constitutes the refereed proceedings of the International Workshop on Data Mining for Biomedical Applications, BioDM 2006, held in Singapore in conjunction with the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2006). The 14 revised full papers presented together with 1 keynote talk were carefully reviewed and selected from 35 submissions. The papers are organized in topical sections on protein-protein interactions, database and search, bio data clustering, and in-silico diagnosis.
Data mining and machine learning applications
Elaborates in detail on the current needs of data mining and machine learning and promotes mutual understanding among research in different disciplines, thus facilitating research development and collaboration. Data, the latest currency of today’s world, is the new gold. In this new form of gold, the most beautiful jewels are data analytics and machine learning. Data mining and machine learning are considered interdisciplinary fields. Data mining is a subset of data analytics, and machine learning involves the use of algorithms that automatically improve through experience based on data.
Data mining : Concepts, models, methods, and algorithms ; 3rd ed.
Presents the latest techniques for analyzing and extracting information from large amounts of data in high-dimensional data spaces / Explores big data and cloud computing / Examines deep learning / Includes information on convolutional neural networks (CNN) / Offers reinforcement learning / Contains semi-supervised learning and S3VM / Reviews model evaluation for unbalanced data
Data Mining : A Knowledge Discovery Approach
This book on data mining details the unique steps of the knowledge discovery process that prescribe the sequence in which data mining projects should be performed. Data Mining offers an authoritative treatment of all development phases from problem and data understanding through data preprocessing to deployment of the results. This knowledge discovery approach is what distinguishes this book from other texts in the area. It concentrates on data preparation, clustering and association rule learning (required for processing unsupervised data), decision trees, rule induction algorithms, neural networks, and many other data mining methods, focusing predominantly on those which have proven successful in data mining projects.
Data Management Technologies and Applications ; 8th International Conference, DATA 2019, Prague, Czech Republic, July 26–28, 2019, Revised Selected Papers
This book constitutes the thoroughly refereed proceedings of the 8th International Conference on Data Management Technologies and Applications, DATA 2019, held in Prague, Czech Republic, in July 2019. The 8 revised full papers were carefully reviewed and selected from 90 submissions. The papers deal with the following topics: decision support systems, data analytics, data and information quality, digital rights management, big data, knowledge management, ontology engineering, digital libraries, mobile databases, object-oriented database systems, and data integrity.
Data Complexity in Pattern Recognition
Data Complexity in Pattern Recognition is unique in its comprehensive coverage and multidisciplinary approach from various methodological and practical perspectives. Researchers and practitioners alike will find this book an insightful reference to learn about the current status of available techniques as well as application areas.
Data analytics, computational statistics, and operations research for engineers : Methodologies and applications
Presents applications of computationally intensive methods, inference techniques, and survival analysis models. It discusses how data mining extracts information and how machine learning improves the computational model based on the new information.
Data Analysis, Machine Learning and Applications ; Proceedings of the 31st Annual Conference of the Gesellschaft für Klassifikation e.V., Albert-Ludwigs-Universität Freiburg, March 7–9, 2007
This volume contains the revised versions of selected papers in the field of data analysis, machine learning and applications presented during the 31st Annual Conference of the German Classification Society (Gesellschaft für Klassifikation - GfKl).