From Data Science

Rescaling logistic regression predictions for under-sampled data?

I'm building a predictive model for a large dataset with a binary 0/1 outcome that is heavily imbalanced.

I'm under-sampling records from the majority outcome class (the 0s) in order to fit the data into my computer's memory prior to fitting a logistic regression model.

Because of the under-sampling, do I need to rescale the model's probability predictions when choosing the optimal threshold or is the scale arbitrary?
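
For context, here is a minimal sketch of the usual prior correction, under assumptions not stated in the post: the 0s were under-sampled at random, every 1 was kept, and `keep_rate` (a hypothetical name) is the fraction of 0s retained. In that setup the odds seen by the model are inflated by a factor of 1 / keep_rate, so the predictions can be mapped back to the full-data scale either directly or by shifting the fitted intercept by log(keep_rate). The function names below are illustrative and do not come from any library.

```python
import math


def correct_probability(p_sampled: float, keep_rate: float) -> float:
    """Map a probability from a model fit on under-sampled data to the full-data scale.

    Assumes keep_rate is the fraction of majority-class (0) rows kept at random
    and that all minority-class (1) rows were kept.
    """
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    # True odds = keep_rate * sampled odds, rewritten in terms of probabilities.
    return keep_rate * p_sampled / (keep_rate * p_sampled + 1.0 - p_sampled)


def corrected_intercept(intercept_sampled: float, keep_rate: float) -> float:
    """Equivalent fix on the logit scale: add log(keep_rate) to the intercept."""
    return intercept_sampled + math.log(keep_rate)


if __name__ == "__main__":
    # Hypothetical numbers: 10% of the 0s kept, model predicts 0.5 on the sampled scale.
    print(correct_probability(0.5, 0.1))
```

The mapping is monotone, so the ranking of records is unchanged; a threshold picked on the sampled scale corresponds to exactly one threshold on the corrected scale. The rescaling matters when the probabilities themselves are used, for example for calibration or expected-cost calculations.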

submitted by /u/RobertWF_47