Book Description

Add a touch of data analytics to your healthcare systems and get insightful outcomes

Key Features

  • Perform healthcare analytics with Python and SQL
  • Build predictive models on real healthcare data with pandas and scikit-learn
  • Use analytics to improve healthcare performance

In recent years, machine learning technologies and analytics have been widely utilized across the healthcare sector. Healthcare Analytics Made Simple bridges the gap between practising doctors and data scientists. It equips data scientists to work with healthcare data and helps them gain better insight from this data in order to improve healthcare outcomes.

This book is a complete overview of machine learning for healthcare analytics, briefly describing the current healthcare landscape, machine learning algorithms, and the Python and SQL programming languages. The step-by-step instructions teach you how to obtain real healthcare data and perform descriptive, predictive, and prescriptive analytics using popular Python packages such as pandas and scikit-learn. The book also reviews the latest research results in disease detection and healthcare image analysis.

By the end of this book, you will understand how to use Python for healthcare data analysis, how to import, collect, clean, and refine data from electronic health record (EHR) surveys, and how to make predictive models with this data through real-world algorithms and code examples.

What you will learn

  • Gain valuable insight into healthcare incentives, finances, and legislation
  • Discover the connection between machine learning and healthcare processes
  • Use SQL and Python to analyze data
  • Measure healthcare quality and provider performance
  • Identify features and attributes to build successful healthcare models
  • Build predictive models using real-world healthcare data
  • Become an expert in predictive modeling with structured clinical data
  • See what lies ahead for healthcare analytics

Who this book is for

Healthcare Analytics Made Simple is for you if you are a developer who has a working knowledge of Python or a related programming language but are new to healthcare or to predictive modeling with healthcare data. Clinicians interested in analytics and healthcare computing will also benefit from this book. This book can also serve as a textbook for students enrolled in an introductory course on machine learning for healthcare.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Healthcare Analytics Made Simple
  3. Dedication
  4. Packt Upsell
    1. Why subscribe?
    2. PacktPub.com
  5. Foreword
  6. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  7. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  8. Introduction to Healthcare Analytics
    1. What is healthcare analytics?
      1. Healthcare analytics uses advanced computing technology
      2. Healthcare analytics acts on the healthcare industry (DUH!)
      3. Healthcare analytics improves medical care
        1. Better outcomes
        2. Lower costs
        3. Ensure quality
    2. Foundations of healthcare analytics
      1. Healthcare
      2. Mathematics
      3. Computer science
    3. History of healthcare analytics
    4. Examples of healthcare analytics
      1. Using visualizations to elucidate patient care
      2. Predicting future diagnostic and treatment events
      3. Measuring provider quality and performance
      4. Patient-facing treatments for disease
    5. Exploring the software
      1. Anaconda
        1. Anaconda navigator
        2. Jupyter notebook
        3. Spyder IDE
      2. SQLite
      3. Command-line tools
      4. Installing a text editor
    6. Summary
    7. References
  9. Healthcare Foundations
    1. Healthcare delivery in the US
      1. Healthcare industry basics
      2. Healthcare financing
        1. Fee-for-service reimbursement
        2. Value-based care
      3. Healthcare policy
        1. Protecting patient privacy and patient rights
        2. Advancing the adoption of electronic medical records
        3. Promoting value-based care
        4. Advancing analytics in healthcare
    2. Patient data – the journey from patient to computer
      1. The history and physical (H&P)
        1. Metadata and chief complaint
        2. History of the present illness (HPI)
        3. Past medical history
        4. Medications
        5. Family history
        6. Social history
        7. Allergies
        8. Review of systems
        9. Physical examination
        10. Additional objective data (lab tests, imaging, and other diagnostic tests)
        11. Assessment and plan
      2. The progress (SOAP) clinical note
    3. Standardized clinical codesets
      1. International Classification of Disease (ICD)
      2. Current Procedural Terminology (CPT)
      3. Logical Observation Identifiers Names and Codes (LOINC)
      4. National Drug Code (NDC)
      5. Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT)
    4. Breaking down healthcare analytics
      1. Population
      2. Medical task
        1. Screening
        2. Diagnosis
        3. Outcome/Prognosis
        4. Response to treatment
      3. Data format
        1. Structured
        2. Unstructured
        3. Imaging
        4. Other data format
      4. Disease
        1. Acute versus chronic diseases
        2. Cancer
        3. Other diseases
      5. Putting it all together – specifying a use case
    5. Summary
    6. References and further reading
  10. Machine Learning Foundations
    1. Model frameworks for medical decision making
      1. Tree-like reasoning
        1. Categorical reasoning with algorithms and trees
        2. Corresponding machine learning algorithms – decision tree and random forest
      2. Probabilistic reasoning and Bayes theorem
        1. Using Bayes theorem for calculating clinical probabilities
          1. Calculating the baseline MI probability
          2. 2 x 2 contingency table for chest pain and myocardial infarction
          3. Interpreting the contingency table and calculating sensitivity and specificity
          4. Calculating likelihood ratios for chest pain (+ and -)
          5. Calculating the post-test probability of MI given the presence of chest pain
        2. Corresponding machine learning algorithm – the Naive Bayes Classifier
      3. Criterion tables and the weighted sum approach
        1. Criterion tables
        2. Corresponding machine learning algorithms – linear and logistic regression
      4. Pattern association and neural networks
        1. Complex clinical reasoning
        2. Corresponding machine learning algorithm – neural networks and deep learning
    2. Machine learning pipeline
      1. Loading the data
      2. Cleaning and preprocessing the data
        1. Aggregating data
        2. Parsing data
        3. Converting types
        4. Dealing with missing data
      3. Exploring and visualizing the data
      4. Selecting features
      5. Training the model parameters
      6. Evaluating model performance
        1. Sensitivity (Sn)
        2. Specificity (Sp)
        3. Positive predictive value (PPV)
        4. Negative predictive value (NPV)
        5. False-positive rate (FPR)
        6. Accuracy (Acc)
        7. Receiver operating characteristic (ROC) curves
        8. Precision-recall curves
        9. Continuously valued target variables
    3. Summary
    4. References and further reading
  11. Computing Foundations – Databases
    1. Introduction to databases
    2. Data engineering with SQL – an example case
    3. Case details – predicting mortality for a cardiology practice
      1. The clinical database
        1. The PATIENT table
        2. The VISIT table
        3. The MEDICATIONS table
        4. The LABS table
        5. The VITALS table
        6. The MORT table
    4. Starting an SQLite session
    5. Data engineering, one table at a time with SQL
      1. Query Set #0 – creating the six tables
        1. Query Set #0a – creating the PATIENT table
        2. Query Set #0b – creating the VISIT table
        3. Query Set #0c – creating the MEDICATIONS table
        4. Query Set #0d – creating the LABS table
        5. Query Set #0e – creating the VITALS table
        6. Query Set #0f – creating the MORT table
        7. Query Set #0g – displaying our tables
      2. Query Set #1 – creating the MORT_FINAL table
      3. Query Set #2 – adding columns to MORT_FINAL
        1. Query Set #2a – adding columns using ALTER TABLE
        2. Query Set #2b – adding columns using JOIN
      4. Query Set #3 – date manipulation – calculating age
      5. Query Set #4 – binning and aggregating diagnoses
        1. Query Set #4a – binning diagnoses for CHF
        2. Query Set #4b – binning diagnoses for other diseases
        3. Query Set #4c – aggregating cardiac diagnoses using SUM
        4. Query Set #4d – aggregating cardiac diagnoses using COUNT
      6. Query Set #5 – counting medications
      7. Query Set #6 – binning abnormal lab results
      8. Query Set #7 – imputing missing variables
        1. Query Set #7a – imputing missing temperature values using normal-range imputation
        2. Query Set #7b – imputing missing temperature values using mean imputation
        3. Query Set #7c – imputing missing BNP values using a uniform distribution
      9. Query Set #8 – adding the target variable
      10. Query Set #9 – visualizing the MORT_FINAL_2 table
    6. Summary
    7. References and further reading
  12. Computing Foundations – Introduction to Python
    1. Variables and types
      1. Strings
      2. Numeric types
    2. Data structures and containers
      1. Lists
      2. Tuples
      3. Dictionaries
      4. Sets
    3. Programming in Python – an illustrative example
    4. Introduction to pandas
      1. What is a pandas DataFrame?
      2. Importing data
        1. Importing data into pandas from Python data structures
        2. Importing data into pandas from a flat file
        3. Importing data into pandas from a database
      3. Common operations on DataFrames
        1. Adding columns
          1. Adding blank or user-initialized columns
          2. Adding new columns by transforming existing columns
        2. Dropping columns
        3. Applying functions to multiple columns
        4. Combining DataFrames
        5. Converting DataFrame columns to lists
        6. Getting and setting DataFrame values
          1. Getting/setting values using label-based indexing with loc
          2. Getting/setting values using integer-based labeling with iloc
          3. Getting/setting multiple contiguous values using slicing
          4. Fast getting/setting of scalar values using at and iat
        7. Other operations
          1. Filtering rows using Boolean indexing
          2. Sorting rows
        8. SQL-like operations
          1. Getting aggregate row COUNTs
          2. Joining DataFrames
    5. Introduction to scikit-learn
      1. Sample data
      2. Data preprocessing
        1. One-hot encoding of categorical variables
        2. Scaling and centering
        3. Binarization
        4. Imputation
      3. Feature selection
      4. Machine learning algorithms
        1. Generalized linear models
        2. Ensemble methods
        3. Additional machine learning algorithms
      5. Performance assessment
    6. Additional analytics libraries
      1. NumPy and SciPy
      2. matplotlib
    7. Summary
  13. Measuring Healthcare Quality
    1. Introduction to healthcare measures
    2. US Medicare value-based programs
    3. The Hospital Value-Based Purchasing (HVBP) program
      1. Domains and measures
        1. The clinical care domain
        2. The patient- and caregiver-centered experience of care domain
        3. Safety domain
        4. Efficiency and cost reduction domain
    4. The Hospital Readmission Reduction (HRR) program
    5. The Hospital-Acquired Conditions (HAC) program
      1. The healthcare-acquired infections domain
      2. The patient safety domain
    6. The End-Stage Renal Disease (ESRD) quality incentive program
    7. The Skilled Nursing Facility Value-Based Program (SNFVBP)
    8. The Home Health Value-Based Program (HHVBP)
    9. The Merit-Based Incentive Payment System (MIPS)
      1. Quality
      2. Advancing care information
      3. Improvement activities
      4. Cost
    10. Other value-based programs
      1. The Healthcare Effectiveness Data and Information Set (HEDIS)
      2. State measures
    11. Comparing dialysis facilities using Python
      1. Downloading the data
      2. Importing the data into your Jupyter Notebook session
      3. Exploring the data rows and columns
      4. Exploring the data geographically
      5. Displaying dialysis centers based on total performance
      6. Alternative analyses of dialysis centers
    12. Comparing hospitals
      1. Downloading the data
      2. Importing the data into your Jupyter Notebook session
      3. Exploring the tables
      4. Merging the HVBP tables
    13. Summary
    14. References
  14. Making Predictive Models in Healthcare
    1. Introduction to predictive analytics in healthcare
    2. Our modeling task – predicting discharge statuses for ED patients
    3. Obtaining the dataset
      1. The NHAMCS dataset at a glance
      2. Downloading the NHAMCS data
        1. Downloading the ED2013 file
        2. Downloading the list of survey items – body_namcsopd.pdf
        3. Downloading the documentation file – doc13_ed.pdf
    4. Starting a Jupyter session
    5. Importing the dataset
      1. Loading the metadata
      2. Loading the ED dataset
    6. Making the response variable
    7. Splitting the data into train and test sets
    8. Preprocessing the predictor variables
      1. Visit information
        1. Month
        2. Day of the week
        3. Arrival time
        4. Wait time
        5. Other visit information
      2. Demographic variables
        1. Age
        2. Sex
        3. Ethnicity and race
        4. Other demographic information
      3. Triage variables
      4. Financial variables
      5. Vital signs
        1. Temperature
        2. Pulse
        3. Respiratory rate
        4. Blood pressure
        5. Oxygen saturation
        6. Pain level
      6. Reason-for-visit codes
      7. Injury codes
      8. Diagnostic codes
      9. Medical history
      10. Tests
      11. Procedures
      12. Medication codes
      13. Provider information
      14. Disposition information
      15. Imputed columns
      16. Identifying variables
      17. Electronic medical record status columns
      18. Detailed medication information
      19. Miscellaneous information
    9. Final preprocessing steps
      1. One-hot encoding
      2. Numeric conversion
      3. NumPy array conversion
    10. Building the models
      1. Logistic regression
      2. Random forest
      3. Neural network
    11. Using the models to make predictions
    12. Improving our models
    13. Summary
    14. References and further reading
  15. Healthcare Predictive Models – A Review
    1. Predictive healthcare analytics – state of the art
    2. Overall cardiovascular risk
      1. The Framingham Risk Score
      2. Cardiovascular risk and machine learning
    3. Congestive heart failure
      1. Diagnosing CHF
      2. CHF detection with machine learning
      3. Other applications of machine learning in CHF
    4. Cancer
      1. What is cancer?
      2. ML applications for cancer
      3. Important features of cancer
        1. Routine clinical data
        2. Cancer-specific clinical data
        3. Imaging data
        4. Genomic data
        5. Proteomic data
      4. An example – breast cancer prediction
        1. Traditional screening of breast cancer
        2. Breast cancer screening and machine learning
    5. Readmission prediction
      1. LACE and HOSPITAL scores
      2. Readmission modeling
    6. Other conditions and events
    7. Summary
    8. References and further reading
  16. The Future – Healthcare and Emerging Technologies
    1. Healthcare analytics and the internet
      1. Healthcare and the Internet of Things
      2. Healthcare analytics and social media
        1. Influenza surveillance and forecasting
        2. Predicting suicidality with machine learning
    2. Healthcare and deep learning
      1. What is deep learning, briefly?
      2. Deep learning in healthcare
        1. Deep feed-forward networks
        2. Convolutional neural networks for images
        3. Recurrent neural networks for sequences
    3. Obstacles, ethical issues, and limitations
      1. Obstacles
      2. Ethical issues
      3. Limitations
    4. Conclusion of this book
    5. References and further reading
  17. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think