From Brushstroke to Meaning: The Fluid Art of Data Science

Published: April 15, 2025

By Amy Humke, Ph.D.
Founder, Critical Influence



When I’m deep in data modeling, it doesn’t feel like programming; it feels like art. Like painting a ballerina in mid-twirl, I’m capturing something dynamic, incomplete, and full of potential. There’s structure, yes, but there’s also freedom: room to explore, revise, and intuit. Data science may have “science” in the name, but the process is rarely linear. It’s creative. It’s interpretive. And just like with my artwork, I begin with a mess of motion and color and work toward something that holds meaning, something that brings clarity, even beauty, to complexity.

The most creative and art-like aspect of data science, I believe, happens in feature identification and engineering. Yes, other parts of the process require creativity as well, but feature engineering is where you’re most often pushed to see beyond your own perspective. And just like in art, there is often a point of pure frustration when you feel you’ve hit a wall: the model is not hitting its target, and you can’t imagine what other data point (that you have access to) could possibly make it work. If you hit that point, take a step back and follow these steps toward creativity.

Let’s Get Creative

Before diving into specifics, prepare yourself to adopt an open mind and banish the negative. Start from a point where even the outlandish could be possible. Nothing is off the table, and there are no stupid thoughts, suggestions, or questions… just unexplored possibilities.


Creative Approaches to Feature Finding

In-Depth Exploration

This one isn’t flashy, but it’s foundational. Take time to reacquaint yourself with your data warehouse: skim the data dictionaries and table names, and explore what’s new. Don’t assume you already know what’s in there; schemas evolve.

Capitalize on Domain Expertise

Sometimes, the best features aren’t found—they’re remembered by someone who understands the problem deeply.

Example: In enrollment modeling, an admissions manager might say, “students who reschedule coaching calls more than once rarely show up to the third attempt.” That insight leads to a call rescheduling volatility feature—something metadata alone wouldn’t reveal.
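To make that concrete, here is a minimal sketch of how such a feature might be derived with pandas. The event log, its column names (student_id, event), and the "more than once" flag are all hypothetical stand-ins for whatever your scheduling system actually records.

```python
import pandas as pd

# Hypothetical coaching-call event log: one row per scheduling event.
events = pd.DataFrame({
    "student_id": [1, 1, 1, 2, 3, 3],
    "event": ["scheduled", "rescheduled", "rescheduled",
              "scheduled", "scheduled", "rescheduled"],
})

# Count reschedules per student (True counts as 1 when summed).
reschedule_count = (
    events["event"].eq("rescheduled")
    .groupby(events["student_id"])
    .sum()
    .rename("reschedule_count")
)

features = reschedule_count.to_frame()
# Flag the pattern the admissions manager described: more than one reschedule.
features["rescheduled_more_than_once"] = features["reschedule_count"] > 1
print(features)
```

Whether the raw count, the flag, or both end up in the model is something to test rather than assume.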

Visual Thinking

When in doubt, draw it out.

Graphs, mind maps, Sankey diagrams, and heatmaps aren’t just for presentations—they help uncover trends hidden in raw numbers.

Example: Plotting time-to-enrollment by marketing channel may reveal that one channel leads to slower but more consistent conversions, prompting exploration of interaction terms between time and channel.
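If a plot suggests something like this, one way to follow up is sketched below. The data, the column names (channel, days_to_enroll), and the choice of one-hot interaction columns are assumptions for illustration, not the only way to encode a time-by-channel interaction.

```python
import pandas as pd

# Hypothetical lead data: marketing channel and days from first contact to enrollment.
leads = pd.DataFrame({
    "channel": ["email", "email", "social", "social", "referral", "referral"],
    "days_to_enroll": [30, 34, 5, 60, 12, 15],
})

# The numbers behind the plot: typical speed (median) and consistency (std) per channel.
summary = leads.groupby("channel")["days_to_enroll"].agg(["median", "std"])

# Interaction terms: one column per channel that holds days_to_enroll
# for rows in that channel and 0 everywhere else.
dummies = pd.get_dummies(leads["channel"], prefix="days_x", dtype=float)
interactions = dummies.mul(leads["days_to_enroll"], axis=0)

print(summary)
print(interactions)
```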


Creative Techniques in Feature Engineering

Here’s a list of feature engineering strategies I’ve used in real-world projects. While the examples focus on structured behavioral and transactional data, the techniques apply broadly.

Simple Statistical Derived Features

Sometimes, the most valuable insights come from the simplest math, such as averages and counts.

Example: From call log data, create features like average call duration or number of weekend calls.
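A minimal pandas sketch of those two features follows. The call log and its columns (student_id, call_start, duration_min) are made up for illustration.

```python
import pandas as pd

# Hypothetical call log: one row per call.
calls = pd.DataFrame({
    "student_id": [1, 1, 1, 2, 2],
    "call_start": pd.to_datetime([
        "2025-03-01 10:00",  # Saturday
        "2025-03-03 14:00",  # Monday
        "2025-03-09 09:30",  # Sunday
        "2025-03-04 11:00",  # Tuesday
        "2025-03-08 16:00",  # Saturday
    ]),
    "duration_min": [12.0, 8.0, 10.0, 20.0, 30.0],
})

# dayofweek runs Monday=0 .. Sunday=6, so 5 and 6 are the weekend.
calls["is_weekend"] = calls["call_start"].dt.dayofweek >= 5

# One row per student with the two derived features.
call_features = calls.groupby("student_id").agg(
    avg_call_duration=("duration_min", "mean"),
    weekend_calls=("is_weekend", "sum"),
)
print(call_features)
```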

Feature Combination and Interaction

Combining existing features across time, geography, or profile segments can reveal non-obvious relationships.

Example: Crossing a student’s program with their region may reveal unique patterns. For categorical data, try ethnicity × income band or channel × contact frequency.
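As a rough sketch, crossing two categorical columns can be as simple as concatenating them and one-hot encoding the result. The program and region values here are invented.

```python
import pandas as pd

# Hypothetical student records.
students = pd.DataFrame({
    "program": ["MBA", "MBA", "Nursing", "Nursing"],
    "region": ["West", "East", "West", "West"],
})

# The crossed feature: each distinct (program, region) pair becomes its own category.
students["program_x_region"] = students["program"] + "_" + students["region"]

# One-hot encode the crossed category for models that need numeric input.
crossed = pd.get_dummies(students["program_x_region"], dtype=int)
print(crossed)
```

Keep an eye on cardinality: crossing two wide categoricals can produce many sparse columns, so it may be worth grouping rare combinations first.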

Feature Transformation

When a continuous feature is skewed, apply a transformation to reshape its distribution.

Example: Apply a log transformation to highly skewed purchase counts to stabilize variance.
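Here is a small sketch of that transformation on made-up purchase counts. I’m assuming np.log1p (the log of 1 + x) rather than a plain log, since counts can be zero.

```python
import numpy as np
import pandas as pd

# Hypothetical purchase counts with one extreme customer.
purchases = pd.Series([0, 1, 2, 3, 5, 8, 250], name="purchase_count")

# log1p computes log(1 + x), so a count of zero maps to zero instead of -inf.
log_purchases = np.log1p(purchases)

# Compare skewness before and after the transform.
skew_before = purchases.skew()
skew_after = log_purchases.skew()
print(f"skew before: {skew_before:.2f}, after: {skew_after:.2f}")
```

The transform preserves the ordering of the values; it only compresses the long right tail.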

Discretization (Binning)

Binning continuous variables can improve both model performance and interpretability.

Example: Create engagement bins like Low, Medium, and High for cleaner modeling.
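One possible way to build those bins is pandas’ pd.cut. The engagement measure (logins in the last 30 days) and the cut points below are hypothetical; in practice you would pick them from the data or from domain knowledge.

```python
import pandas as pd

# Hypothetical engagement measure: logins in the last 30 days.
engagement = pd.Series([0, 2, 5, 9, 14, 40], name="logins_last_30d")

# Hypothetical cut points. Intervals are right-inclusive:
# (-1, 2] -> Low, (2, 9] -> Medium, (9, inf) -> High.
bins = [-1, 2, 9, float("inf")]
labels = ["Low", "Medium", "High"]

engagement_bin = pd.cut(engagement, bins=bins, labels=labels)
print(engagement_bin.tolist())
```

If you would rather have bins with roughly equal numbers of rows, pd.qcut is the quantile-based alternative.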

Target Encoding

For high-cardinality categorical features, replace each category with the average target value for that group.

Handle carefully to avoid data leakage:
- Use out-of-fold encoding.
- Smooth with the global mean to reduce noise in low-frequency categories.

Recommended Read: "Target Encoding Done the Right Way" by Max Halford.
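The two safeguards above can be sketched in a few lines of pandas and NumPy. This is one possible implementation, not a reference one: the function name oof_target_encode, the smoothing weight, the random fold assignment, and the sample data are all my own assumptions.

```python
import numpy as np
import pandas as pd

def oof_target_encode(df, col, target, n_folds=5, smoothing=10.0, seed=0):
    """Out-of-fold target encoding, smoothed toward the training-fold mean."""
    rng = np.random.default_rng(seed)
    fold_id = rng.permutation(len(df)) % n_folds  # random, near-equal folds
    encoded = pd.Series(np.nan, index=df.index, dtype=float)

    for fold in range(n_folds):
        held_out = fold_id == fold
        train = df[~held_out]
        prior = train[target].mean()  # global mean of the training folds
        stats = train.groupby(col)[target].agg(["sum", "count"])
        # Smoothed category mean: rare categories are pulled toward the prior.
        smoothed = (stats["sum"] + smoothing * prior) / (stats["count"] + smoothing)
        # Each held-out row is encoded only from rows in the other folds;
        # categories unseen in those folds fall back to the prior.
        values = df.loc[held_out, col].map(smoothed).fillna(prior)
        encoded.loc[held_out] = values.to_numpy()
    return encoded

# Hypothetical data: a categorical column and a binary target.
df = pd.DataFrame({
    "zip_code": ["A"] * 6 + ["B"] * 3 + ["C"],
    "enrolled": [1, 1, 1, 0, 1, 1, 0, 0, 1, 1],
})
df["zip_code_te"] = oof_target_encode(df, "zip_code", "enrolled",
                                      n_folds=5, smoothing=2.0)
print(df)
```

Because every row is encoded without seeing its own target value, the single "C" row never gets to memorize its own outcome.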


Time Series Feature Generation

When working with temporal or sequential data, engineered time-based features are often essential.

Time-based features capture recency and periodic patterns, which are often critical in behavioral prediction.
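As an illustration, here is a minimal sketch of a few recency and periodicity features built from a login history. The table, its columns, the as_of prediction date, and the 30-day window are all hypothetical.

```python
import pandas as pd

# Hypothetical login history: one row per login.
logins = pd.DataFrame({
    "student_id": [1, 1, 1, 2],
    "login_at": pd.to_datetime(
        ["2025-03-01", "2025-03-20", "2025-03-28", "2025-02-10"]
    ),
})
as_of = pd.Timestamp("2025-04-01")  # hypothetical prediction date

# Only use history that existed before the prediction date.
history = logins[logins["login_at"] < as_of]
recent = history[history["login_at"] >= as_of - pd.Timedelta(days=30)]

ts_features = history.groupby("student_id").agg(last_login=("login_at", "max"))
# Recency: days since the most recent login.
ts_features["days_since_last_login"] = (as_of - ts_features["last_login"]).dt.days
# Rolling-window activity: logins in the 30 days before the prediction date.
ts_features["logins_last_30d"] = (
    recent.groupby("student_id").size().reindex(ts_features.index, fill_value=0)
)
# Periodicity: day of week of the last login (Monday=0 .. Sunday=6).
ts_features["last_login_dow"] = ts_features["last_login"].dt.dayofweek
print(ts_features)
```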


Take Care

Feature engineering is powerful—but don’t overdo it.

Be especially cautious about data leakage. Ask yourself, “Would I have known this at the time the prediction was made?” If not, it doesn’t belong in the model.
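That question can be turned into a simple point-in-time filter. The sketch below assumes two hypothetical tables, one with the moment each prediction is made and one with timestamped events, and keeps only events that had already happened.

```python
import pandas as pd

# Hypothetical: when each student's prediction is made.
predictions = pd.DataFrame({
    "student_id": [1, 2],
    "predicted_at": pd.to_datetime(["2025-03-10", "2025-03-15"]),
})
# Hypothetical timestamped events, some of which occur after the prediction.
events = pd.DataFrame({
    "student_id": [1, 1, 2, 2],
    "event_at": pd.to_datetime(
        ["2025-03-05", "2025-03-12", "2025-03-14", "2025-03-20"]
    ),
})

# Attach the prediction time to every event, then drop anything from the future.
joined = events.merge(predictions, on="student_id")
known = joined[joined["event_at"] <= joined["predicted_at"]]

# Features are built only from what was known at prediction time.
event_count = known.groupby("student_id").size().rename("events_before_prediction")
print(event_count)
```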


Conclusion

It’s easy to hit a wall in feature engineering when your model plateaus and you’re out of ideas. I hope these techniques help you push through, spark new ideas, and feel just a little more creative in the process.

I’ve often noticed how many people in data science also have artistic sides—musicians, painters, dancers, writers. Maybe it’s not a coincidence. I truly believe you need a spark of creativity to do this work well. After all, turning data into something meaningful isn’t just technical—it’s interpretive, fluid, and, yes, a little bit artistic.

What about you? Where do you see the art in your own data science practice?
