Julie von Neumann

Data Science & Analytics Professional

ABOUT

Julie von Neumann is a data science and analytics professional with a strong foundation in statistics, machine learning, and large-scale data systems. Her work spans healthcare analytics, anomaly and fraud detection, and complex modeling environments where precision, scalability, and reliability are essential. She specializes in transforming high-dimensional data into actionable intelligence that informs strategic decision-making and improves operational performance. Her approach emphasizes the design of robust analytical frameworks and production-ready models that deliver measurable, real-world impact.

Her expertise includes the development of scalable machine learning pipelines, advanced anomaly detection systems, and cohort stratification methodologies for diverse and complex datasets. She places particular emphasis on model performance, interpretability, and long-term sustainability, ensuring that analytical systems remain reliable as data and organizational needs evolve.

Julie works closely with engineering teams and business stakeholders to align technical implementation with strategic objectives, bridging the gap between quantitative analysis and practical application. Her work is grounded in intellectual rigor, structured reasoning, and a commitment to building analytical solutions that are both technically sound and operationally meaningful.

EXPERIENCE

07/2024 – Present
Data Science Manager, Weisheit Holdings

Oversees analytical and quantitative risk assessment for private trust portfolios, supporting fiduciary decision-making, capital preservation, and long-term investment strategy. Develops advanced analytical frameworks to evaluate portfolio risk exposure, asset performance, and capital allocation across diversified financial instruments. Applies statistical modeling, distributed data processing, and machine learning techniques to identify emerging risk patterns, monitor portfolio stability, and improve transparency of investment performance. Builds scalable data infrastructure using Python, Spark, and PySpark to enable efficient analysis of large-scale financial data. Provides analytical insight that strengthens governance, enhances risk visibility, and supports disciplined portfolio management in a fiduciary trust environment.

  • Developed and maintained quantitative risk assessment frameworks to evaluate portfolio exposure, volatility, drawdown risk, and capital stability across diversified private trust investment portfolios.
  • Designed scalable analytical pipelines using Python, PySpark, and Spark to process and analyze large-scale financial and portfolio data, improving efficiency, reproducibility, and analytical depth.
  • Built statistical and machine learning models to identify emerging risk patterns, detect anomalies in portfolio performance, and support proactive risk mitigation and capital preservation strategies.
  • Performed multi-factor portfolio analysis to assess asset behavior, correlation structures, and sensitivity to market conditions, enabling more informed capital allocation and investment oversight.
  • Created automated reporting and monitoring systems to track portfolio performance, risk indicators, and key financial metrics, improving transparency and supporting fiduciary governance requirements.
  • Collaborated with trust administrators and investment decision-makers to translate quantitative analysis into actionable insights, strengthening risk management practices and long-term portfolio stability.
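
The drawdown and volatility measures referenced above reduce to simple, well-defined computations. A minimal pure-Python sketch (the portfolio values below are invented for illustration, not actual trust data):

```python
import math

def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst

def volatility(returns):
    """Sample standard deviation of periodic returns."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var)

# Hypothetical monthly portfolio values
values = [100.0, 105.0, 98.0, 110.0, 99.0, 112.0]
returns = [values[i] / values[i - 1] - 1 for i in range(1, len(values))]

dd = max_drawdown(values)   # peak 110 -> trough 99, i.e. 10%
vol = volatility(returns)
```

In a production pipeline these same aggregations would typically run over Spark DataFrames rather than Python lists, but the metric definitions are unchanged.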

01/2023 – 10/2023
Senior Data Scientist, CVS Health (Aetna)

Developed advanced analytical and machine learning models supporting behavioral health and suicide prevention initiatives, focusing on early identification of high-risk individuals within younger populations. Integrated healthcare claims data, third-party behavioral interaction data, and clinical notes to construct risk scoring models and patient risk profiles. Applied natural language processing and predictive analytics to identify patterns associated with depression, bipolar disorder, and suicide risk, enabling earlier intervention and improving the effectiveness of preventative care strategies. Work also evaluated the clinical and financial impact of behavioral health conditions, supporting data-driven decision-making and resource allocation within population health management programs.

  • Developed predictive risk models to identify individuals at elevated risk of suicide and severe behavioral health outcomes, integrating healthcare claims data, third-party behavioral interaction data, and clinical engagement metrics.
  • Applied natural language processing (NLP) techniques to analyze behavioral specialist notes and patient interaction transcripts, extracting clinically relevant signals to enhance risk stratification and early intervention modeling.
  • Designed and implemented patient risk scoring frameworks to quantify behavioral health risk levels, enabling proactive identification and prioritization of high-risk individuals within population health management programs.
  • Analyzed relationships between behavioral health conditions, including depression and bipolar disorder, and clinical outcomes such as hospitalization frequency, comorbidities, and suicide risk progression.
  • Evaluated the financial and clinical impact of behavioral health conditions by analyzing cost-of-care patterns, healthcare utilization, and treatment trajectories across diverse patient cohorts.
  • Engineered analytical workflows and integrated multi-source datasets to support scalable behavioral health analytics, improving model performance, data reliability, and decision-making effectiveness.
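
As a toy illustration of extracting risk signals from clinical text, a weighted-lexicon scorer is sketched below. The terms and weights are invented for the example; production risk stratification would rely on validated clinical features and far richer NLP models.

```python
# Hypothetical weighted lexicon for flagging risk-related language in notes
RISK_TERMS = {"hopeless": 2.0, "self-harm": 3.0, "withdrawn": 1.0, "insomnia": 0.5}

def risk_signal(note: str) -> float:
    """Sum lexicon weights for each risk term found in a note (case-insensitive)."""
    text = note.lower()
    return sum(w for term, w in RISK_TERMS.items() if term in text)

notes = [
    "Patient reports insomnia and feeling hopeless.",
    "Routine follow-up, no concerns noted.",
]
scores = [risk_signal(n) for n in notes]
```

A score like this would enter a risk model as one feature among many, alongside claims-derived and engagement-derived features.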

04/2022 – 08/2022
Principal Data Scientist, Cohere Health

Directed machine learning and analytics initiatives supporting an automated prior authorization platform designed to improve clinical decision efficiency and reduce administrative burden. Led a team of data scientists in analyzing large-scale authorization and clinical datasets using Spark and PySpark to identify model deficiencies, improve decision accuracy, and enhance system reliability. Applied advanced statistical methods and natural language processing to optimize model inputs, reduce false flags, and improve automation rates. Oversaw analytical strategy, model evaluation, and performance monitoring to ensure scalable, production-ready solutions within a high-volume healthcare environment.

  • Led a team of data scientists responsible for developing and optimizing machine learning models supporting automated prior authorization workflows, improving system accuracy and reducing manual clinical review requirements.
  • Directed analysis of large-scale healthcare authorization and clinical datasets using Apache Spark and PySpark, identifying model deficiencies, data quality issues, and opportunities for performance improvement.
  • Designed and implemented machine learning models and statistical methodologies to improve decision precision, reduce false positive flags, and increase overall automation rates.
  • Applied natural language processing (NLP) techniques to analyze clinical notes and authorization documentation, extracting structured features to enhance predictive model performance.
  • Established model performance monitoring frameworks and analytical reporting systems to track automation efficiency, decision accuracy, and operational impact.
  • Collaborated with engineering and product teams to translate analytical findings into system improvements, contributing to enhanced scalability, reliability, and operational effectiveness of the authorization platform.
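
Trading automation rate against false flags, as described above, often comes down to tuning the model's decision threshold. A schematic sketch with made-up scores and labels (1 = the automated decision was correct):

```python
def precision_at(threshold, scores, labels):
    """Precision of decisions automated at or above a score threshold."""
    flagged = [(s, y) for s, y in zip(scores, labels) if s >= threshold]
    if not flagged:
        return None
    return sum(y for _, y in flagged) / len(flagged)

# Hypothetical model confidence scores and ground-truth correctness
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40]
labels = [1,    1,    1,    0,    1,    0]

curve = {t: precision_at(t, scores, labels) for t in (0.5, 0.75, 0.9)}
```

Raising the threshold here lifts precision (fewer false flags) at the cost of automating fewer cases; monitoring frameworks like those described above track exactly this trade-off over time.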

11/2020 – 04/2022
Senior Data Scientist, General Dynamics Information Technology — CMS Fraud Prevention System

Contributed to the design, development, and enhancement of advanced statistical and machine learning models supporting the Centers for Medicare & Medicaid Services (CMS) Fraud Prevention System, a national platform responsible for identifying fraud, waste, and abuse across Medicare and Medicaid programs. Developed predictive and anomaly detection frameworks capable of analyzing large-scale, highly imbalanced healthcare claims data to identify suspicious billing behavior and emerging fraud patterns. Work supported both pre-payment intervention models and post-payment investigative analytics, strengthening program integrity and enabling data-driven enforcement actions within CMS’s Center for Program Integrity.

  • Designed and implemented advanced anomaly detection frameworks to identify fraudulent and anomalous billing patterns within large-scale Medicare and Medicaid claims datasets, utilizing unsupervised learning techniques including Isolation Forest, Local Outlier Factor, One-Class SVM, and Robust Covariance.
  • Developed and optimized supervised machine learning models for fraud detection using algorithms such as Random Forest, Gradient Boosting, Logistic Regression, and Support Vector Machines, significantly improving detection accuracy and investigative targeting efficiency.
  • Engineered predictive modeling solutions supporting both pre-payment intervention and post-payment investigative workflows, enabling earlier detection of high-risk claims and strengthening fraud prevention capabilities within CMS’s Center for Program Integrity.
  • Analyzed massive, high-dimensional healthcare datasets to identify behavioral patterns, anomalous provider activity, and emerging fraud trends, applying advanced statistical methods and exploratory data analysis techniques.
  • Modernized legacy analytical infrastructure by migrating predictive models and statistical workflows from SAS and SPSS environments into scalable Python and Spark-based systems, improving performance, maintainability, and long-term scalability.
  • Developed automated analytical pipelines for data ingestion, preprocessing, feature engineering, model scoring, and reporting, improving operational efficiency, reproducibility, and integration with enterprise fraud prevention systems.
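
Of the unsupervised techniques named above, Isolation Forest is the most self-contained to demonstrate. A minimal scikit-learn sketch on synthetic data (the two "claim" features and the planted outliers are illustrative only, not CMS data):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic claims: e.g. billed amount and visit count for typical providers
normal = rng.normal(loc=[100.0, 5.0], scale=[10.0, 1.0], size=(200, 2))
# Two implausible billing profiles appended at the end
outliers = np.array([[500.0, 40.0], [450.0, 35.0]])
X = np.vstack([normal, outliers])

model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(X)  # 1 = inlier, -1 = flagged anomaly
```

Isolation Forest flags points that are easy to isolate with random axis-aligned splits, which suits high-dimensional claims data where fraud is rare and labeled examples are scarce.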

05/2019 – 05/2020
Senior Healthcare Economics and Outcomes Research Consultant, Optum (UnitedHealth Group)

Evaluated the economic impact of clinical programs supporting Optum’s vertically integrated care delivery model, focusing on optimizing site-of-care utilization and reducing unnecessary hospital expenditures. Analyzed large-scale claims, membership, and provider datasets to identify opportunities to shift procedures from high-cost hospital settings to Optum care centers, improving cost efficiency while maintaining quality of care. Developed risk, utilization, and cost-of-care metrics to assess provider and patient behavior, quantify financial impact, and support value-based care initiatives. Delivered analytical insights, scoring methodologies, and executive-level reporting that informed network strategy, provider performance evaluation, and operational decision-making.

  • Analyzed large-scale healthcare claims, membership, and provider datasets using SAS to evaluate cost-of-care, utilization patterns, and economic impact of clinical programs across Optum’s integrated care network.
  • Developed advanced cost, utilization, and risk-based metrics, including per-member-per-month (PMPM) indicators and provider performance measures, to identify opportunities for reducing unnecessary hospital utilization.
  • Designed patient and provider scoring methodologies to quantify cost impact, utilization efficiency, and adherence to site-of-care optimization strategies, supporting value-based care and cost containment initiatives.
  • Built analytical dashboards and reporting tools to evaluate geographic accessibility of Optum care centers, enabling assessment of patient site-of-care decisions and identifying opportunities to redirect procedures to lower-cost care settings.
  • Performed cohort analyses and episode-of-care evaluations using claims and ETG (Episode Treatment Group) methodologies to assess clinical program effectiveness and financial outcomes.
  • Delivered data-driven insights and strategic recommendations to clinical leadership, medical directors, and executive stakeholders, supporting provider network optimization and enterprise cost reduction initiatives.
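
The per-member-per-month (PMPM) indicators mentioned above reduce to a simple normalization of cost by enrollment exposure. A sketch with invented figures:

```python
def pmpm(total_cost: float, member_months: int) -> float:
    """Per-member-per-month cost: total allowed cost over total member months."""
    return total_cost / member_months

# Hypothetical cohort: 1,200 members enrolled for 12 months, $5.4M in claims
members, months, total_cost = 1200, 12, 5_400_000.0
cohort_pmpm = pmpm(total_cost, members * months)  # $375 per member per month
```

Normalizing by member months (rather than member count) keeps cohorts with partial-year enrollment comparable, which is why PMPM is the standard unit for site-of-care and cost-containment comparisons.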

05/2013 – 05/2019
Senior Data Analyst, Horizon Blue Cross Blue Shield

Developed advanced statistical and predictive modeling frameworks to evaluate provider performance, benchmark healthcare costs, and support value-based care initiatives within Patient-Centered Medical Home (PCMH) and Accountable Care Organization (ACO) programs. Designed and implemented risk assessment and cost-of-care models using large-scale claims and clinical datasets, applying both classical statistical techniques and machine learning algorithms. Work emphasized methodological rigor, automation, and scalable analytical workflows to enable reliable performance measurement, improve financial forecasting, and strengthen data-driven decision-making across enterprise healthcare initiatives.

  • Analyzed large-scale healthcare claims and operational datasets using advanced descriptive statistics and exploratory analysis to support Patient-Centered Medical Home (PCMH) and Accountable Care Organization (ACO) initiatives.
  • Designed and implemented statistical and predictive models to evaluate provider performance and benchmark cost of care, leveraging Episode Treatment Group (ETG) methodologies to ensure accurate and standardized comparisons.
  • Developed healthcare risk assessment models by analyzing cost distributions, constructing scoring frameworks, and applying ranking methodologies to evaluate and improve model performance.
  • Built predictive models using a wide range of machine learning and statistical algorithms, including Generalized Linear Models (GLM), Logistic Regression, Ridge, Lasso, K-Nearest Neighbors, and Decision Trees, supporting cost optimization and performance improvement programs.
  • Engineered automated analytical pipelines for data preprocessing, model scoring, validation, and reporting, significantly improving efficiency, reproducibility, and scalability of analytical workflows.
  • Established robust data preparation methodologies to address missing data, skewed distributions, and data quality challenges, applying appropriate imputation, normalization, and transformation techniques to ensure model reliability.
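
A minimal sketch of the preparation steps listed above, assuming median imputation for missing values and a log transform for right-skewed cost distributions (the cost values are illustrative):

```python
import math

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = sorted(v for v in values if v is not None)
    n = len(observed)
    median = (observed[n // 2] if n % 2
              else (observed[n // 2 - 1] + observed[n // 2]) / 2)
    return [median if v is None else v for v in values]

def log_transform(values):
    """log(1 + x) compresses right-skewed distributions such as claim costs."""
    return [math.log1p(v) for v in values]

costs = [120.0, None, 80.0, 15000.0, 95.0]  # one missing, one extreme value
prepared = log_transform(impute_median(costs))
```

Median imputation is robust to the extreme values common in claims data, and the log transform keeps a single catastrophic claim from dominating a linear model's fit.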

11/2007 – 05/2013
Actuarial Consultant, MetLife Corp

Developed and validated actuarial reserve models and analytical frameworks supporting long-term care and critical illness insurance portfolios, ensuring accurate financial reporting and regulatory compliance. Designed analytical tools and validation methodologies to monitor reserve adequacy, identify data anomalies, and evaluate model performance across complex, multi-dimensional insurance datasets. Work supported actuarial valuation, financial forecasting, and enterprise risk management processes, strengthening the reliability and transparency of reserve calculations and contributing to sound financial governance.

  • Developed and validated actuarial reserve models supporting Long-Term Care (LTC) and Critical Illness (CI) insurance portfolios, ensuring accuracy, financial integrity, and regulatory compliance.
  • Monitored and analyzed actuarial reserve trends across multiple business segments, producing reserve roll-forward analyses, valuation reports, and formal actuarial sign-offs for monthly and quarterly financial reporting.
  • Engineered analytical tools and automated workflows using SQL, VBA, and relational database systems to support actuarial modeling, reserve validation, and enterprise risk assessment.
  • Performed comprehensive validation and scenario testing of actuarial reserve models, analyzing model sensitivity, identifying discrepancies, and improving reliability of reserve calculations.
  • Investigated and resolved data quality and integrity issues affecting actuarial models, implementing corrective strategies and improving data validation processes to prevent future discrepancies.
  • Collaborated with actuarial, IT, and finance teams to support data migration, regression testing, and implementation of business rules, ensuring consistency and accuracy across production and analytical systems.
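
The reserve roll-forward analyses mentioned above rest on a standard identity: opening reserve plus additions minus releases should equal the closing reserve. A toy reconciliation check with invented figures:

```python
def rollforward_gap(opening, additions, releases, closing):
    """Difference between reported closing reserve and the rolled-forward value."""
    return closing - (opening + additions - releases)

# Hypothetical quarterly figures (in $000s)
opening, additions, releases, closing = 10_000.0, 1_250.0, 400.0, 10_850.0
gap = rollforward_gap(opening, additions, releases, closing)
assert abs(gap) < 1e-6, "reserve roll-forward does not reconcile"
```

A nonzero gap is the signal that drives the discrepancy investigations described above; automating the check catches data or model issues before they reach financial reporting.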

SKILLS

Development Environments

  • Jupyter Notebook
  • Google Colab
  • Apache Zeppelin
  • EMR Studio
  • VS Code
  • PyCharm
  • Spyder

Big Data & Cloud Platforms

  • Hadoop
  • Hive
  • Spark
  • Databricks
  • AWS
  • Azure
  • GCP

Languages & Scripting

  • Python
  • PySpark
  • SQL
  • SAS
  • Spark RDD API
  • jQuery
  • HTML/CSS

Tools & Applications

  • SAS Enterprise Guide
  • Tableau
  • Looker
  • Docker
  • GitLab
  • GitHub
  • BigQuery

KEY ACHIEVEMENTS

  • Architected and deployed advanced predictive modeling and anomaly detection frameworks across healthcare, insurance, and financial risk domains, leveraging supervised and unsupervised machine learning techniques to identify fraud, assess risk exposure, and improve operational decision-making at enterprise scale.
  • Developed large-scale analytical pipelines using Python, PySpark, and distributed Spark environments to process and analyze high-volume medical claims, behavioral health, and financial portfolio data, significantly improving analytical efficiency, model scalability, and data reliability.
  • Designed and implemented statistical risk assessment models for healthcare cost prediction, suicide risk stratification, actuarial reserve validation, and investment portfolio risk evaluation, strengthening institutional risk governance and improving outcome predictability.
  • Contributed to federal healthcare integrity initiatives by developing fraud detection models supporting CMS Fraud Prevention System (FPS) programs, enabling early identification of anomalous provider billing behavior and supporting prevention of fraud, waste, and abuse within Medicare and Medicaid systems.
  • Applied advanced analytical methodologies, including cohort stratification and multi-factor predictive modeling, to integrate clinical notes, claims data, and behavioral interaction data, improving detection of high-risk populations and enabling targeted intervention strategies.
  • Modernized legacy analytical ecosystems by migrating statistical and actuarial models from SAS and SPSS platforms into scalable Python and Spark architectures, enhancing performance, reproducibility, and long-term sustainability of enterprise analytics infrastructure.