These are some examples of Data Science books in our catalog. Use SouthernSearch, the online CSU catalog, to find books at Southern and the other CSCU libraries.
Data Science in Context: Foundations, Challenges, Opportunities by Alfred Z. Spector
Data science is the foundation of our modern world. It underlies applications used by billions of people every day, providing new tools, forms of entertainment, economic growth, and potential solutions to difficult, complex problems. These opportunities come with significant societal consequences, raising fundamental questions about issues such as data quality, fairness, privacy, and causation. In this book, four leading experts convey the excitement and promise of data science and examine the major challenges in gaining its benefits and mitigating its harms. They offer frameworks for critically evaluating the ingredients and the ethical considerations needed to apply data science productively, illustrated by extensive application examples. The authors' far-ranging exploration of these complex issues will stimulate data science practitioners and students, as well as humanists, social scientists, scientists, and policy makers, to study and debate how data science can be used more effectively and more ethically to better our world.
Call Number: Main Stacks QA76.9.D343 S68425 2023
ISBN: 9781009272209
Publication Date: 2022-10-20
Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition by Wendy Hui Kyong Chun; Alex Barnett (Illustrator)
How big data and machine learning encode discrimination and create agitated clusters of comforting rage. In Discriminating Data, Wendy Hui Kyong Chun reveals how polarization is a goal--not an error--within big data and machine learning. These methods, she argues, encode segregation, eugenics, and identity politics through their default assumptions and conditions. Correlation, which grounds big data's predictive potential, stems from twentieth-century eugenic attempts to "breed" a better future. Recommender systems foster angry clusters of sameness through homophily. Users are "trained" to become authentically predictable via a politics and technology of recognition. Machine learning and data analytics thus seek to disrupt the future by making disruption impossible. Chun, who has a background in systems design engineering as well as media studies and cultural theory, explains that although machine learning algorithms may not officially include race as a category, they embed whiteness as a default. Facial recognition technology, for example, relies on the faces of Hollywood celebrities and university undergraduates--groups not famous for their diversity. Homophily emerged as a concept to describe white U.S. resident attitudes to living in biracial yet segregated public housing. Predictive policing technology deploys models trained on studies of predominantly underserved neighborhoods. Trained on selected and often discriminatory or dirty data, these algorithms are only validated if they mirror this data. How can we release ourselves from the vice-like grip of discriminatory data? Chun calls for alternative algorithms, defaults, and interdisciplinary coalitions in order to desegregate networks and foster a more democratic big data.
Call Number: Main Circulating Stacks QA76.9.B45 C57 2021
ISBN: 9780262046220
Publication Date: 2021-11-02
Econometrics and Data Science: Apply Data Science Techniques to Model Complex Problems and Implement Solutions for Economic Problems by Tshepo Chris Nokeri
Get up to speed on the application of machine learning approaches in macroeconomic research. This book brings together economics and data science. Author Tshepo Chris Nokeri begins by introducing you to covariance analysis, correlation analysis, cross-validation, hyperparameter optimization, regression analysis, and residual analysis. In addition, he presents an approach to contend with multi-collinearity. He then debunks a time series model recognized as the additive model. He reveals a technique for binarizing an economic feature to perform classification analysis using logistic regression. He brings in the Hidden Markov Model, used to discover hidden patterns and growth in the world economy. The author demonstrates unsupervised machine learning techniques such as principal component analysis and cluster analysis. Key deep learning concepts and ways of structuring artificial neural networks are explored along with training them and assessing their performance. The Monte Carlo simulation technique is applied to simulate the purchasing power of money in an economy. Lastly, the Structural Equation Model (SEM) is considered to integrate correlation analysis, factor analysis, multivariate analysis, causal analysis, and path analysis. After reading this book, you should be able to recognize the connection between econometrics and data science. You will know how to apply a machine learning approach to modeling complex economic problems and others beyond this book. You will know how to circumvent and enhance model performance, together with the practical implications of a machine learning approach in econometrics, and you will be able to deal with pressing economic problems.
What You Will Learn
· Examine complex, multivariate, linear-causal structures through the path and structural analysis technique, including non-linearity and hidden states
· Be familiar with practical applications of machine learning and deep learning in econometrics
· Understand theoretical framework and hypothesis development, and techniques for selecting appropriate models
· Develop, test, validate, and improve key supervised (i.e., regression and classification) and unsupervised (i.e., dimension reduction and cluster analysis) machine learning models, alongside neural networks, Markov, and SEM models
· Represent and interpret data and models
Who This Book Is For
Beginning and intermediate data scientists, economists, machine learning engineers, statisticians, and business executives
Call Number: Online Book
ISBN: 9781484274330
Publication Date: 2021-10-27
Foundations of Data Science by Avrim Blum; John Hopcroft; Ravindran Kannan
This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.
Call Number: Main Circulating Stacks QA76 .B5675 2020
ISBN: 9781108485067
Publication Date: 2020-01-23
Interactive Visual Data Analysis by Christian Tominski; Heidrun Schumann
In the age of big data, being able to make sense of data is an important key to success. Interactive Visual Data Analysis advocates the synthesis of visualization, interaction, and automatic computation to facilitate insight generation and knowledge crystallization from large and complex data. The book provides a systematic and comprehensive overview of visual, interactive, and analytical methods. It introduces criteria for designing interactive visual data analysis solutions, discusses factors influencing the design, and examines the involved processes. The reader is made familiar with the basics of visual encoding and gets to know numerous visualization techniques for multivariate data, temporal data, geo-spatial data, and graph data. A dedicated chapter introduces general concepts for interacting with visualizations and illustrates how modern interaction technology can facilitate the visual data analysis in many ways. Addressing today's large and complex data, the book covers relevant automatic analytical computations to support the visual data analysis. The book also sheds light on advanced concepts for visualization in multi-display environments, user guidance during the data analysis, and progressive visual data analysis. The authors present a top-down perspective on interactive visual data analysis with a focus on concise and clean terminology. Many real-world examples and rich illustrations make the book accessible to a broad interdisciplinary audience from students, to experts in the field, to practitioners in data-intensive application domains.
Features:
· Dedicated to the synthesis of visual, interactive, and analysis methods
· Systematic top-down view on visualization, interaction, and automatic analysis
· Broad coverage of fundamental and advanced visualization techniques
· Comprehensive chapter on interacting with visual representations
· Extensive integration of automatic computational methods
· Accessible portrayal of cutting-edge visual analytics technology
· Foreword by Jack van Wijk
For more information, you can also visit the author website, where the book's figures will be made available under the CC BY Open Access license: https://ivda-book.de/
Call Number: Main Circulating Stacks QA76.9.I52 T66 2020
ISBN: 9780367898755
Publication Date: 2020-04-20
Practical Data Science for Information Professionals by David Stuart
Practical Data Science for Information Professionals provides an accessible introduction to a potentially complex field, providing readers with an overview of data science and a framework for its application. It provides detailed examples and analysis on real data sets to explore the basics of the subject in three principal areas: clustering and social network analysis; predictions and forecasts; and text analysis and mining. As well as highlighting a wealth of user-friendly data science tools, the book also includes some example code in two of the most popular programming languages (R and Python) to demonstrate the ease with which the information professional can move beyond the graphical user interface and achieve significant analysis with just a few lines of code. After reading, readers will understand:
· the growing importance of data science
· the role of the information professional in data science
· some of the most important tools and methods that information professionals can use.
Bringing together the growing importance of data science and the increasing role of information professionals in the management and use of data, Practical Data Science for Information Professionals will provide a practical introduction to the topic specifically designed for the information community. It will appeal to librarians and information professionals all around the world, from large academic libraries to small research libraries. By focusing on the application of open source software, it aims to reduce barriers for readers to use the lessons learned within.
Call Number: Online Book
ISBN: 9781783303441
Publication Date: 2020-07-24
Practical Python Data Visualization: A Fast Track Approach To Learning Data Visualization With Python by Ashwin Pajankar
Quickly start programming with Python 3 for data visualization with this step-by-step, detailed guide. This book's programming-friendly approach using libraries such as leather, NumPy, Matplotlib, and Pandas will serve as a template for business and scientific visualizations. You'll begin by installing Python 3, see how to work in Jupyter notebook, and explore Leather, Python's popular data visualization charting library. You'll also be introduced to the scientific Python 3 ecosystem and work with the basics of NumPy, an integral part of that ecosystem. Later chapters are focused on various NumPy routines along with getting started with Scientific Data visualization using matplotlib. You'll review the visualization of 3D data using graphs and networks and finish up by looking at data visualization with Pandas, including the visualization of COVID-19 data sets. The code examples are tested on popular platforms like Ubuntu, Windows, and Raspberry Pi OS. With Practical Python Data Visualization you'll master the core concepts of data visualization with Pandas and the Jupyter notebook interface.
What You'll Learn
· Review practical aspects of Python Data Visualization with programming-friendly abstractions
· Install Python 3 and Jupyter on multiple platforms including Windows, Raspberry Pi, and Ubuntu
· Visualize COVID-19 data sets with Pandas
Who This Book Is For
Data Science enthusiasts and professionals, Business analysts and managers, software engineers, data engineers.
Call Number: Online Book
ISBN: 9781484264546
Publication Date: 2020-10-25
R Visualizations: Derive Meaning from Data by David Gerbing
R Visualizations: Derive Meaning from Data focuses on one of the two major topics of data analytics: data visualization, a.k.a., computer graphics. In the book, major R systems for visualization are discussed, organized by topic and not by system. Anyone doing data analysis will be shown how to use R to generate any of the basic visualizations with the R visualization systems. Further, this book introduces the author's lessR system, which always can accomplish a visualization with less coding than the use of other systems, sometimes dramatically so, and also provides accompanying statistical analyses.
Key Features
· Presents thorough coverage of the leading R visualization system, ggplot2.
· Gives specific guidance on using base R graphics to attain visualizations of the same quality as those provided by ggplot2.
· Shows how to create a wide range of data visualizations: distributions of categorical and continuous variables, many types of scatterplots including with a third variable, time series, and maps.
· Inclusion of the various approaches to R graphics organized by topic instead of by system.
· Presents the recent work on interactive visualization in R.
David W. Gerbing received his PhD from Michigan State University in 1979 in quantitative analysis, and currently is a professor of quantitative analysis in the School of Business at Portland State University. He has published extensively in the social and behavioral sciences with a focus on quantitative methods. His lessR package has been in development since 2009.