Generating Realistic Synthetic Time Series: From Financial Markets to IoT Sensor Streams

Published by Entrobit · April 2026


Your Tabular Generator Won't Work Here

CTGAN, TVAE, Gaussian Copula: they're excellent at generating synthetic records for static tabular data. But hand them a time series and they'll produce something useless.

The reason is simple. Tabular generators treat each row as an independent sample from a joint distribution. They learn marginals, correlations, and higher-order dependencies between columns, but they have zero concept of sequence. They can't capture the fact that a patient's blood pressure at 3:00 PM depends on the reading at 2:55 PM, or that a stock price at close is shaped by every trade during the session.

Time-series data carries structure that these models can't represent: autocorrelation, non-stationarity, regime shifts, seasonality, irregular sampling. Destroy any of these properties in your synthetic output and the downstream models that depend on them will fail.

This isn't a niche concern. Finance, healthcare, manufacturing, energy, logistics: most enterprises sit on time-series data, and the demand for realistic synthetic versions is growing fast.

Why the Demand Is Urgent

Financial markets. A fintech firm backtesting algorithmic trading strategies needs to stress-test against market conditions that haven't occurred yet. Synthetic market data has to preserve autocorrelation in returns, volatility clustering (GARCH effects), fat tails, and cross-asset correlations that shift during stress events. Get any of these wrong and the backtest is fiction.
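Volatility clustering is easy to see in simulation. The sketch below generates returns from a GARCH(1,1) process (parameters are illustrative, not calibrated to any real market): raw returns look uncorrelated, but squared returns are autocorrelated, which is exactly the property a synthetic generator must reproduce.

```python
import numpy as np

def simulate_garch11(n, omega=0.05, alpha=0.1, beta=0.85, seed=0):
    """Simulate returns with GARCH(1,1) volatility clustering.

    sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
    Parameters are illustrative, not calibrated to any market.
    """
    rng = np.random.default_rng(seed)
    r = np.zeros(n)
    sigma2 = np.full(n, omega / (1 - alpha - beta))  # unconditional variance
    for t in range(1, n):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
        r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return r

returns = simulate_garch11(5000)
# Clustering shows up as autocorrelation in *squared* returns,
# even though the raw returns are nearly uncorrelated.
sq = returns ** 2
sq_lag1 = np.corrcoef(sq[:-1], sq[1:])[0, 1]
raw_lag1 = np.corrcoef(returns[:-1], returns[1:])[0, 1]
```

A synthetic generator that matches the marginal return distribution but destroys the autocorrelation in squared returns would pass naive distributional checks and still be useless for risk backtesting.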

IoT and predictive maintenance. A factory with hundreds of sensors might see a handful of equipment failures per year across its entire fleet. Anomaly detection models starved for positive examples perform terribly. Synthetic sensor streams with realistic degradation patterns, noise characteristics, and failure signatures can augment training data by orders of magnitude.

Clinical simulation. A synthetic patient whose blood pressure is independently sampled at each time point would show medically impossible volatility. Realistic synthetic vitals must preserve within-patient dynamics, between-patient heterogeneity, and clinically plausible responses to interventions.

Three Approaches That Work

Three architectural families have emerged as the serious contenders for time-series synthesis, all implemented in Synthcity.

TimeGAN

TimeGAN (Yoon, Jarrett, and van der Schaar, NeurIPS 2019) remains the most widely cited approach. The clever bit is a four-component architecture.

An autoencoder pair (embedding and recovery networks) maps original sequences into a lower-dimensional latent space. A GAN-style generator-discriminator pair operates in that latent space, producing synthetic trajectories. But here's what makes TimeGAN different from just "a GAN on sequences": it adds a supervised loss that explicitly trains the generator to predict the next timestep given preceding steps. This supervised component acts as a regularizer, pushing the generator toward temporally coherent outputs rather than relying on the adversarial signal alone.

Three losses are jointly optimized: reconstruction, adversarial, and supervised. In practice, TimeGAN preserves autocorrelation structure and inter-feature temporal dependencies reasonably well. Its weaknesses are the usual GAN weaknesses: training instability, difficulty with very long sequences, and trouble with highly non-stationary data.
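To make the three-loss structure concrete, here is a toy numpy sketch of how the terms combine. The networks themselves are replaced by precomputed arrays, and the loss weights are illustrative placeholders, not the values from the paper or Synthcity's implementation.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def timegan_losses(x, x_reconstructed, h_real, h_next_pred, d_on_fake,
                   lambda_sup=1.0, eta_rec=10.0):
    """Combine TimeGAN's three objectives (weights are illustrative).

    x               : original sequences           (batch, time, feat)
    x_reconstructed : embed -> recover round trip  (batch, time, feat)
    h_real          : latent codes of real data    (batch, time, dim)
    h_next_pred     : supervisor's one-step-ahead prediction of h_real[:, 1:]
    d_on_fake       : discriminator probabilities on generated latents
    """
    # 1) Reconstruction: the autoencoder should invert the embedding.
    l_rec = mse(x, x_reconstructed)
    # 2) Supervised: predict the next latent step from preceding steps.
    l_sup = mse(h_real[:, 1:], h_next_pred)
    # 3) Adversarial (generator side, non-saturating form).
    l_adv = float(-np.mean(np.log(d_on_fake + 1e-8)))
    return eta_rec * l_rec + lambda_sup * l_sup + l_adv
```

With perfect reconstruction, perfect next-step prediction, and a fully fooled discriminator, the combined loss goes to roughly zero; degrading any one component raises it, which is the point of optimizing them jointly.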

Fourier Flows

Fourier Flows take a completely different angle by working in the frequency domain. Transform the time series via FFT, model the distribution of Fourier coefficients with normalizing flows, then inverse-transform back to time domain.

The appeal is that periodic patterns, seasonal components, and frequency-specific noise are all explicit in spectral space, making them easier to model. Normalizing flows also give you exact likelihood computation (GANs don't compute likelihoods at all; VAEs compute a lower bound), which enables principled model comparison.

Fourier Flows are strongest on data dominated by periodic or quasi-periodic behavior: sensor data with daily cycles, physiological signals like heart rate, intraday trading volume patterns. They struggle with abrupt regime changes, trend components, and aperiodic transients, because these aren't naturally expressed in spectral representations.
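Fourier Flows themselves require a trained normalizing flow, but the core intuition of generating in spectral space has a much older, simpler cousin: the phase-randomization surrogate. The sketch below is not Fourier Flows; it just shows how keeping Fourier magnitudes while scrambling phases produces a new series with the same power spectrum, and therefore the same autocorrelation.

```python
import numpy as np

def phase_randomized_surrogate(x, seed=0):
    """Classic frequency-domain surrogate: keep each Fourier coefficient's
    magnitude (the power spectrum, hence the ACF) but randomize its phase.
    A minimal spectral-space baseline, not Fourier Flows itself.
    """
    rng = np.random.default_rng(seed)
    coeffs = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=coeffs.shape)
    new = np.abs(coeffs) * np.exp(1j * phases)
    new[0] = coeffs[0]            # keep the mean (DC bin) untouched
    if x.size % 2 == 0:
        new[-1] = coeffs[-1]      # Nyquist bin must stay real
    return np.fft.irfft(new, n=x.size)

t = np.linspace(0, 20 * np.pi, 1000)
x = np.sin(t) + 0.1 * np.random.default_rng(1).standard_normal(1000)
s = phase_randomized_surrogate(x)
```

This baseline also illustrates the family's weakness: phase randomization smears any localized event (a spike, a regime break) across the whole series, because such transients live in the phases, not the magnitudes.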

TimeVAE

TimeVAE adapts the variational autoencoder for sequences. Encoder maps input sequences to a latent distribution; decoder generates synthetic sequences from latent samples. Temporal dependencies are handled through recurrent layers (LSTM or GRU) in both encoder and decoder.

Like tabular TVAE, it offers stable training and reliable convergence. No adversarial balancing needed. The trade-off is also the same: a tendency toward over-smoothed outputs where the KL regularization suppresses sharp temporal features and rare events.

TimeVAE is a good default when training stability matters more than capturing every fine-grained temporal pattern, or when differential privacy is required (the ELBO works cleanly with DP-SGD).
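The KL regularization mentioned above is worth seeing in its usual diagonal-Gaussian form. This is a generic VAE sketch, not TimeVAE's actual implementation: the reparameterization trick makes sampling differentiable, and the KL term is what pulls the posterior toward the prior (and, weighted too strongly, over-smooths the output).

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) differentiably via z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dims, per sample.

    This is the regularizer that, weighted too heavily, suppresses sharp
    temporal features in the decoded sequences.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

rng = np.random.default_rng(0)
mu, log_var = np.zeros((4, 8)), np.zeros((4, 8))
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)   # posterior == prior -> KL is zero
```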

On the Horizon: Diffusion for Time Series

Diffusion-based approaches for temporal data are active research, building on TabDDPM's success with tabular data. Temporal convolution networks or transformers replace the MLP backbone. Early results are promising for imputation and conditional forecasting, but for generating complete sequences from scratch, diffusion methods aren't yet as mature as TimeGAN or Fourier Flows. Give it a couple of years.

How to Evaluate Temporal Fidelity

Standard tabular metrics aren't enough for time series. You need metrics that specifically test whether temporal structure is preserved.

Discriminative score. Train a classifier (typically recurrent) to distinguish real from synthetic sequences. Test accuracy near 0.5 means the synthetic data is indistinguishable. Significantly higher means something's off. It's a holistic metric, but a black box: a bad score tells you there's a problem without telling you what.

Predictive score. Train a sequence model on synthetic data to forecast the next timestep, then evaluate on real data. Compare against a model trained on real data. The gap measures whether the synthetic data preserves the dynamics that drive forecasting. This is the time-series version of TSTR (train on synthetic, test on real).
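A minimal sketch of the predictive score, using a pooled least-squares AR(1) forecaster as the "sequence model" (real evaluations typically use an RNN; the AR(1) choice here is purely for brevity):

```python
import numpy as np

def fit_ar1(sequences):
    """Least-squares AR(1): predict x[t+1] from x[t], pooled over sequences."""
    x_t = np.concatenate([s[:-1] for s in sequences])
    x_next = np.concatenate([s[1:] for s in sequences])
    A = np.stack([x_t, np.ones_like(x_t)], axis=1)
    coef, _, _, _ = np.linalg.lstsq(A, x_next, rcond=None)
    return coef  # (slope, intercept)

def forecast_mae(coef, sequences):
    slope, intercept = coef
    errs = [np.abs(s[1:] - (slope * s[:-1] + intercept)) for s in sequences]
    return float(np.mean(np.concatenate(errs)))

rng = np.random.default_rng(0)

def make_ar1_batch(n_seq, n_t, phi=0.8):
    out = []
    for _ in range(n_seq):
        s = np.zeros(n_t)
        for t in range(1, n_t):
            s[t] = phi * s[t - 1] + rng.standard_normal()
        out.append(s)
    return out

real = make_ar1_batch(50, 200)
synthetic = make_ar1_batch(50, 200)  # stand-in for a generator's output
mae_tstr = forecast_mae(fit_ar1(synthetic), real)  # train synthetic, test real
mae_trtr = forecast_mae(fit_ar1(real), real)       # train real, test real
gap = mae_tstr - mae_trtr
```

Here the "synthetic" data has the same dynamics as the real data, so the gap is near zero; a generator that destroyed the autocorrelation would produce a model with visibly worse forecasting error on real sequences.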

Autocorrelation and cross-correlation preservation. Compute ACF for each feature in both real and synthetic data. Compare them. Compute cross-correlation between feature pairs. These directly measure whether the temporal dependency structure survived synthesis.
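The ACF comparison is a few lines of numpy. The sketch below summarizes the real-vs-synthetic ACF gap as a mean absolute difference over the first few lags (the summary statistic is a reasonable convention, not a standard named metric):

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of a 1-D series at lags 1..max_lag."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

def acf_distance(real, synth, max_lag=10):
    """Mean absolute gap between real and synthetic ACFs; lower is better."""
    return float(np.mean(np.abs(acf(real, max_lag) - acf(synth, max_lag))))

rng = np.random.default_rng(0)

def ar1(n, phi):
    s = np.zeros(n)
    for t in range(1, n):
        s[t] = phi * s[t - 1] + rng.standard_normal()
    return s

real = ar1(5000, 0.8)
good = ar1(5000, 0.8)            # same dynamics: small ACF distance
bad = rng.standard_normal(5000)  # white noise: marginals fine, ACF destroyed
```

The `bad` series is the failure mode tabular generators produce: plausible marginals, no temporal structure.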

Distributional metrics, done right. Kolmogorov-Smirnov, Wasserstein, Jensen-Shannon: they all still apply, but compute them both marginally (per feature, across all time points) and conditionally (per time point, across samples). Matching overall marginals isn't enough; the distribution at each timestep should also match.
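The marginal-vs-conditional distinction matters in practice. The sketch below computes a two-sample KS statistic without SciPy and applies it both ways to a deliberately adversarial case: real data drifts upward, synthetic data drifts downward, and the pooled marginals are nearly identical while the per-timestep distributions are badly wrong.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (max ECDF gap)."""
    grid = np.sort(np.concatenate([a, b]))
    ecdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    ecdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(ecdf_a - ecdf_b)))

def ks_marginal_and_per_step(real, synth):
    """real, synth: (n_samples, n_timesteps) arrays for one feature.

    Returns the marginal KS (all time points pooled) and the worst
    per-timestep KS; the second catches errors the first can miss.
    """
    marginal = ks_statistic(real.ravel(), synth.ravel())
    per_step = max(ks_statistic(real[:, t], synth[:, t])
                   for t in range(real.shape[1]))
    return marginal, per_step

rng = np.random.default_rng(0)
real = rng.standard_normal((500, 20)) + np.linspace(0, 2, 20)   # upward drift
synth = rng.standard_normal((500, 20)) + np.linspace(2, 0, 20)  # reversed drift
marginal, per_step = ks_marginal_and_per_step(real, synth)
```

Pooled over time, both datasets are the same mixture of Gaussians, so the marginal KS is small; at the first and last timesteps the means differ by 2, so the per-timestep KS is large. Only the conditional check flags the problem.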

Walking Through a Real Scenario

A fintech firm needs synthetic market data to stress-test a portfolio risk model. Five years of daily price series for 200 equities. About 1,250 timesteps per series. Cross-asset correlations that shift during market stress.

Profile first. Before picking a generator, characterize the temporal structure: autoregressive order (how many lags carry signal), stationarity (ADF test), volatility clustering (ARCH effects), rolling cross-sectional correlation stability. This profile tells you which generator properties actually matter.
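A dependency-free sketch of this profiling step. A full profile would use an ADF test for stationarity and a formal ARCH-LM test (e.g. via statsmodels); the proxies below (lag-1 autocorrelation of returns and of squared returns, plus excess kurtosis) are cheap stand-ins that catch the same headline properties.

```python
import numpy as np

def lag1_autocorr(x):
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def profile_series(prices):
    """Quick temporal profile of one price series (proxy statistics only)."""
    returns = np.diff(np.log(prices))
    return {
        "return_ac1": lag1_autocorr(returns),           # linear predictability
        "sq_return_ac1": lag1_autocorr(returns ** 2),   # volatility clustering
        "excess_kurtosis": float(                       # fat tails
            np.mean((returns - returns.mean()) ** 4) / returns.var() ** 2 - 3
        ),
    }

# Random-walk prices with Gaussian returns: all three proxies sit near zero.
prices = np.exp(np.cumsum(0.01 * np.random.default_rng(0).standard_normal(2000)))
profile = profile_series(prices)
```

Real equity series would typically show near-zero `return_ac1`, clearly positive `sq_return_ac1`, and positive excess kurtosis, which is the signature pointing toward a generator that handles GARCH-like dynamics.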

Select a generator. Strong volatility clustering, fat tails, and time-varying cross-asset correlations point toward TimeGAN as a reasonable first choice. Fourier Flows might complement it for periodic components like quarterly earnings effects.

Generate and evaluate. Run discriminative score, predictive score, ACF preservation, and tail distributional metrics (e.g., VaR at the 1st percentile). Compare against bootstrap resampling as a baseline.

Validate downstream. Train the risk model on synthetic data. Compare its VaR and CVaR estimates against the model trained on real data. This is the real test: does the synthetic data produce a risk model that makes similar decisions?
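The downstream comparison reduces to tail statistics. A minimal sketch of historical VaR and CVaR, using fat-tailed Student-t returns as stand-ins for both the real portfolio P&L and the generator's output:

```python
import numpy as np

def var_cvar(pnl, level=0.01):
    """Historical VaR and CVaR at the given tail level (e.g. 1%).

    pnl: array of portfolio returns; losses are the negative values.
    """
    var = float(np.quantile(pnl, level))      # 1st-percentile return
    cvar = float(pnl[pnl <= var].mean())      # mean of the tail beyond it
    return var, cvar

rng = np.random.default_rng(0)
real_pnl = rng.standard_t(df=4, size=10_000) * 0.01   # fat-tailed stand-in
synth_pnl = rng.standard_t(df=4, size=10_000) * 0.01  # generator output stand-in
var_gap = abs(var_cvar(real_pnl)[0] - var_cvar(synth_pnl)[0])
```

If the generator thins the tails (the classic VAE over-smoothing failure), the synthetic VaR will sit closer to zero than the real one, and the risk model trained on it will systematically understate losses.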

Assess privacy. If the synthetic data leaves the building, run nearest-neighbor distance analysis, membership inference, and (if DP was applied) report the formal ε guarantee.
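The nearest-neighbor distance analysis can be sketched in a few lines. Sequences are flattened to fixed-length vectors here for simplicity; the ratio normalizes synthetic-to-real distances by the typical real-to-real nearest-neighbor distance, so values near zero flag memorization.

```python
import numpy as np

def nn_distance_ratio(real, synth):
    """Nearest-neighbor privacy check on flattened sequence windows.

    Median distance from each synthetic record to its closest real record,
    divided by the median real-to-real nearest-neighbor distance.
    Ratios near zero suggest the generator memorized training rows.
    """
    def min_dists(a, b, exclude_self=False):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        if exclude_self:
            np.fill_diagonal(d, np.inf)
        return d.min(axis=1)

    synth_to_real = min_dists(synth, real)
    real_to_real = min_dists(real, real, exclude_self=True)
    return float(np.median(synth_to_real) / np.median(real_to_real))

rng = np.random.default_rng(0)
real = rng.standard_normal((200, 16))    # flattened windows, stand-in data
fresh = rng.standard_normal((200, 16))   # independent samples: safe
copied = real + 1e-6                     # near-verbatim copies: leaky
```

A healthy generator lands near 1.0 on this ratio; a memorizing one collapses toward zero. It is a screening tool, not a formal guarantee, which is why the DP ε still matters when privacy is contractual.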

The Production Challenges Nobody Warns You About

Model selection is only part of the story. Deploying time-series synthesis in production surfaces several less obvious problems.

Irregular sampling. Real sensor readings arrive at variable intervals. Patient visits are unpredictable. Most generators assume regular sampling, so you either resample to a grid (losing information) or use architectures designed for irregular sequences (still an active research area).
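The resample-to-a-grid workaround is one call to `np.interp`. The sketch below makes the trade-off explicit: the output is regular and generator-friendly, but everything between readings is a linear guess.

```python
import numpy as np

def resample_to_grid(timestamps, values, step):
    """Linearly interpolate an irregularly sampled series onto a regular grid.

    The common pragmatic fix: makes the data digestible for standard
    generators, at the cost of inventing values between real readings.
    """
    grid = np.arange(timestamps[0], timestamps[-1] + 1e-12, step)
    return grid, np.interp(grid, timestamps, values)

ts = np.array([0.0, 0.7, 1.1, 2.9, 3.0, 4.6])    # irregular arrival times
vals = np.array([1.0, 2.0, 0.5, 0.5, 3.0, 1.0])
grid, regular = resample_to_grid(ts, vals, step=0.5)
```

Note what is lost: the gap between 1.1 and 2.9 is filled with a straight line, and the sharp jump at 3.0 is smeared across neighboring grid points. If the inter-arrival times themselves carry signal (they often do in clinical data), they need to be modeled as a feature, not interpolated away.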

Variable-length sequences. Patients have different hospital stays. Sensors have different operational lifetimes. Generators need to handle this without padding artifacts distorting the output.

Multi-scale patterns. Many time series show structure at hourly, daily, weekly, and seasonal scales simultaneously. No single architecture captures all of these cleanly.

Conditional generation. You often don't want unconditional generation. You want synthetic vital signs for a patient with specific demographics and clinical history. Conditioning mechanisms need to be threaded into the temporal generation process.

Organizations that have gotten time-series synthesis working in production have generally adopted platforms that unify multiple temporal architectures, evaluation metrics, and privacy mechanisms in a single framework. The ability to swap generators based on data characteristics, evaluate against temporal fidelity criteria, and apply appropriate privacy guarantees, all without changing infrastructure, is what separates production systems from research notebooks.

What Comes Next

Time-series synthesis is maturing fast but still trails tabular synthesis in both algorithms and tooling. The next few years should bring serious progress in temporal diffusion models, better handling of irregular and multi-scale data, and tighter DP integration with temporal generators.

For organizations heavy on time-series data (that's most of finance, manufacturing, healthcare, and IoT), the window to establish robust generation and evaluation pipelines is now. Early adopters who build this infrastructure today will have a real edge as the technology catches up to the demand.


References: Yoon, Jarrett & van der Schaar (2019), Time-series Generative Adversarial Networks, NeurIPS; Alcaraz & Strodthoff (2023), Diffusion-based Time Series Imputation and Forecasting; Synthcity documentation.