API Reference
Droplet-Film Model Development Project — Technical Documentation.
Overview
This document describes the classes, methods, and parameters of the DFT Development project. The API is intended for both research and industrial use.
The project follows object-oriented design with clear separation between physics modeling, data management, and machine learning components.
Core Classes and Modules
The project consists of several key modules:
dft_model.py: Core physics model implementation
utils.py: Data management and utility functions
Individual Jupyter notebooks for different approaches
DFT Class — Core Physics Model
The DFT class implements the Droplet-Film Model for predicting critical flow rates in gas wells.
Class definition
class DFT:
    """
    Droplet-Film Model for predicting critical flow rates in gas wells.
    This class implements a physics-informed machine learning approach that combines
    fundamental fluid dynamics principles with data-driven optimization to predict
    when gas wells will experience liquid loading.
    """
Constructor
__init__(self, seed=42, feature_tol=1.0, dev_tol=1e-3, multiple_dev_policy="max")
Parameters:
seed (int): Random seed for reproducibility. Default: 42
feature_tol (float): Feature distance threshold for matching. Default: 1.0
dev_tol (float): Deviation tolerance for angle matching. Default: 1e-3
multiple_dev_policy (str): Policy for handling multiple matches. Options: "max", "min", "mean", "median". Default: "max"
Attributes: seed, feature_tol, dev_tol, multiple_dev_policy, opt_params (set after fitting), n_train (set after fitting).
Methods — fit
fit(self, X, y)
Train the DFT model on provided data.
Parameters: X (np.ndarray) shape (n_samples, 10), y (np.ndarray) shape (n_samples,). Returns: self.
Features (in order): Dia, Dev(deg), Area (m2), z, GasDens, LiquidDens, g (m/s2), P/T, friction_factor, critical_film_thickness.
Implementation: Uses Powell optimization from scipy.optimize; optimizes 5 global parameters (p1–p5) plus one alpha per training sample; bounds each alpha to [0, 1]; stops after at most 5000 iterations or 10000 function evaluations.
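The optimization setup described above can be sketched as follows. This is a minimal illustration, not the project's code: toy_eq is a hypothetical stand-in for the physics equation, and the parameter layout (five globals followed by one alpha per sample), the starting point, and the demo data are all assumptions. Only the Powell method, the [0, 1] alpha bounds, and the iteration limits come from the description above.

```python
import numpy as np
from scipy.optimize import minimize

N_GLOBAL = 5  # p1..p5 are shared by every sample


def toy_eq(params, X):
    """Hypothetical stand-in for DFT._eq; the real physics equation differs."""
    p1, p2, p3, p4, p5 = params[:N_GLOBAL]
    alpha = params[N_GLOBAL:]  # one blending weight per training sample
    return alpha * (p1 * X[:, 0] + p2) + (1.0 - alpha) * (p3 * X[:, 1] + p4) + p5


def mse_loss(params, X, y):
    """Mean squared error between the toy prediction and the target."""
    return float(np.mean((toy_eq(params, X) - y) ** 2))


def fit_powell(X, y):
    """Minimize the MSE with Powell's method; alphas are confined to [0, 1]."""
    n = X.shape[0]
    x0 = np.concatenate([np.ones(N_GLOBAL), np.full(n, 0.5)])  # assumed start
    # Globals are left unbounded; each per-sample alpha is bounded to [0, 1].
    bounds = [(None, None)] * N_GLOBAL + [(0.0, 1.0)] * n
    return minimize(
        mse_loss, x0, args=(X, y), method="Powell", bounds=bounds,
        options={"maxiter": 5000, "maxfev": 10000},
    )


# Small synthetic problem generated from the toy model itself.
rng = np.random.default_rng(42)
X_demo = rng.uniform(0.0, 1.0, size=(6, 2))
true_params = np.concatenate([[2.0, 0.5, -1.0, 1.5, 0.2], rng.uniform(0, 1, 6)])
y_demo = toy_eq(true_params, X_demo)

result = fit_powell(X_demo, y_demo)
opt_globals, opt_alphas = result.x[:N_GLOBAL], result.x[N_GLOBAL:]
```

Because every training sample carries its own alpha, the parameter vector grows with the dataset, which is one reason a derivative-free method with generous evaluation limits is a plausible choice here.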
Methods — predict
predict(self, X, dev_train=None, alpha_strategy='enhanced_dev_based')
Make predictions on new data.
Parameters: X (np.ndarray), optional dev_train, alpha_strategy (must be 'enhanced_dev_based'). Returns: np.ndarray of shape (n_samples,).
Alpha assignment strategy (by well deviation angle):
Dev < 10°: Regular deviation-based matching. Find training samples within dev_tol; apply multiple_dev_policy if several match; use the mean training alpha if none match.
10° ≤ Dev < 20°: Minimum alpha strategy. Find training samples within dev_tol; use the minimum alpha among the matches; use the mean training alpha if none match.
Dev ≥ 20°: Full-feature matching. Compute the Euclidean distance to all training samples; use the closest sample's alpha if that distance < feature_tol; use the mean training alpha otherwise.
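The three deviation bands above can be sketched as a single function. This is a hedged illustration under stated assumptions: assign_alpha is a hypothetical helper, not a method of the DFT class; "within dev_tol" is read as an absolute difference of at most dev_tol; and the deviation angle is taken from column index 1, following the feature order listed for fit.

```python
import numpy as np

# Reducers for the multiple_dev_policy options.
_POLICIES = {"max": np.max, "min": np.min, "mean": np.mean, "median": np.median}


def assign_alpha(x, X_train, dev_train, alpha_train,
                 dev_col=1, dev_tol=1e-3, feature_tol=1.0, policy="max"):
    """Pick an alpha for one sample x according to its deviation band."""
    fallback = float(np.mean(alpha_train))  # mean training alpha
    dev = x[dev_col]
    if dev < 20.0:
        # Both low-angle bands start from deviation-based matching.
        matches = alpha_train[np.abs(dev_train - dev) <= dev_tol]
        if matches.size == 0:
            return fallback
        if dev < 10.0:
            return float(_POLICIES[policy](matches))  # Dev < 10 degrees
        return float(np.min(matches))                 # 10 <= Dev < 20 degrees
    # Dev >= 20 degrees: nearest neighbour over the full feature vector.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = int(np.argmin(dists))
    return float(alpha_train[nearest]) if dists[nearest] < feature_tol else fallback


# Toy training set with two features only (column 1 is the deviation angle).
X_train = np.array([[0.1, 5.0], [0.1, 5.0], [0.1, 15.0], [0.1, 15.0], [0.1, 30.0]])
alpha_train = np.array([0.2, 0.6, 0.3, 0.7, 0.9])
dev_train = X_train[:, 1]

low = assign_alpha(np.array([0.1, 5.0]), X_train, dev_train, alpha_train)
mid = assign_alpha(np.array([0.1, 15.0]), X_train, dev_train, alpha_train)
high = assign_alpha(np.array([0.1, 30.5]), X_train, dev_train, alpha_train)
far = assign_alpha(np.array([0.1, 45.0]), X_train, dev_train, alpha_train)
```

In this toy data the 5° sample matches two training alphas and takes the larger under the default "max" policy, while the 45° sample has no neighbour within feature_tol and falls back to the mean training alpha.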
Methods — _eq (physics equation)
_eq(self, params, X)
Compute predicted values using the physics equation.
Physics equation:
Where:
term1 involves \(2 g \, \mathrm{Dia}\), \((\rho_l - \rho_g)\), \(\cos(\mathrm{Dev})\), and parameter \(p_4\).
term2 involves \(|\sin(p_5 \cdot \mathrm{Dev})|^{p_3}\) and \((\rho_l - \rho_g)^{p_2} / \rho_g^2\).
Methods — _loss
_loss(self, params)
Compute loss function for optimization. Returns: float (MSE).
Helm Class — Data Management
The Helm class (in models.utils) handles dataset loading, train/test splitting, scaling, and model training/evaluation. The CSV must include the feature columns, plus Qcr (target), Gasflowrate, and Test status.
Class definition
from models.utils import Helm
Constructor
__init__(self, path, seed=42, drop_cols=None, includ_cols=None, test_size=0.20, scale=True)
Parameters: path (str), seed (int), drop_cols, includ_cols (lists), test_size (float, default 0.20), scale (bool, default True).
Attributes (set after initialization): X_train, X_test, y_train, y_test (numpy arrays). If scale=True, use X_train_rdy, X_test_rdy (and optionally y_train_rdy, y_test_rdy) for scaled data. Also: feature_names, scaler_X, scaler_y (when scale=True).
Methods — evolv_model
evolv_model(self, build_model, hparam_grid, k_folds=5)
Train model with hyperparameter optimization. Returns: best trained model. Performs grid search and k-fold cross-validation; stores predictions and metrics.
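A grid search combined with k-fold cross-validation can be sketched as below. This is an illustration of the general technique, not the body of evolv_model: the helper names, the Ridge toy model, the MSE scoring, and the assumption that build_model is called with the hyperparameters as keyword arguments are all hypothetical.

```python
import itertools
import numpy as np


class Ridge:
    """Tiny closed-form ridge regressor, used only to exercise the search."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def fit(self, X, y):
        A = X.T @ X + self.lam * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(A, X.T @ y)
        return self

    def predict(self, X):
        return X @ self.coef_


def kfold_indices(n_samples, k_folds, seed=42):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k_folds)
    for i in range(k_folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, folds[i]


def grid_search_cv(build_model, hparam_grid, X, y, k_folds=5):
    """Return (best_model, best_params, best_score) by mean validation MSE."""
    names = list(hparam_grid)
    best_params, best_score = None, np.inf
    for values in itertools.product(*(hparam_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = []
        for train_idx, val_idx in kfold_indices(len(X), k_folds):
            model = build_model(**params).fit(X[train_idx], y[train_idx])
            scores.append(np.mean((model.predict(X[val_idx]) - y[val_idx]) ** 2))
        if np.mean(scores) < best_score:
            best_params, best_score = params, float(np.mean(scores))
    # Refit on all data with the winning hyperparameters.
    return build_model(**best_params).fit(X, y), best_params, best_score


# Noise-free linear data, so the unregularized model should win.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5])
best_model, best_params, best_score = grid_search_cv(
    Ridge, {"lam": [0.0, 10.0, 100.0]}, X_demo, y_demo, k_folds=5
)
```

Refitting on the full training set after the search is a common convention; whether evolv_model does the same is not stated in this reference.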
QLatticeWrapper Class — Symbolic Regression
Wrapper for Feyn QLattice for automated symbolic regression with a scikit-learn compatible interface.
Constructor
__init__(self, feature_tags, output_tag="Qcr", seed=42, max_complexity=10, n_epochs=10, criterion="bic")
Parameters: feature_tags (List[str]), output_tag, seed, max_complexity, n_epochs, criterion ("bic", "aic", "r2").
Methods: fit(X, y), predict(X), express() (returns SymPy expression).
Data Format Requirements
Input CSV must contain the 10 features (Dia, Dev(deg), Area (m2), z, GasDens, LiquidDens, g (m/s2), P/T, friction_factor, critical_film_thickness) plus Qcr, Gasflowrate, and Test status for Helm.
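A quick pre-flight check of the CSV header against the required columns might look like the sketch below. The column names are copied from the list above; missing_columns is a hypothetical helper and is not part of Helm.

```python
import csv
import io

FEATURE_COLUMNS = [
    "Dia", "Dev(deg)", "Area (m2)", "z", "GasDens", "LiquidDens",
    "g (m/s2)", "P/T", "friction_factor", "critical_film_thickness",
]
HELM_COLUMNS = FEATURE_COLUMNS + ["Qcr", "Gasflowrate", "Test status"]


def missing_columns(csv_text, required=HELM_COLUMNS):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# A complete header, and one with two columns removed.
good_csv = ",".join(HELM_COLUMNS) + "\n"
bad_csv = ",".join(c for c in HELM_COLUMNS if c not in ("z", "Qcr")) + "\n"

ok = missing_columns(good_csv)
problems = missing_columns(bad_csv)
```

Running such a check before loading gives a clear message about which columns are absent instead of a failure deeper in the pipeline.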
Data Validation, Error Handling, Performance
Helm loads the CSV and performs a train/test split stratified on the Test status column (stratify=loading). The API includes error handling for invalid inputs, missing columns, optimization failures, and (for QLattice) network issues.
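To illustrate what a stratified split does, the sketch below partitions indices so that each label keeps roughly the same share in the train and test sets. It is a NumPy-only illustration with a hypothetical helper and made-up label values; how Helm performs the split internally is not shown in this reference.

```python
import numpy as np


def stratified_split(labels, test_size=0.20, seed=42):
    """Return (train_idx, test_idx) with each label represented proportionally."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_parts, test_parts = [], []
    for value in np.unique(labels):
        # Shuffle the indices of this label, then carve off its test share.
        idx = rng.permutation(np.flatnonzero(labels == value))
        n_test = int(round(test_size * idx.size))
        test_parts.append(idx[:n_test])
        train_parts.append(idx[n_test:])
    return np.sort(np.concatenate(train_parts)), np.sort(np.concatenate(test_parts))


# Hypothetical Test status values: 30 of one class, 20 of the other.
status = np.array(["Loaded"] * 30 + ["Unloaded"] * 20)
train_idx, test_idx = stratified_split(status, test_size=0.20, seed=42)
test_counts = {v: int(np.sum(status[test_idx] == v)) for v in np.unique(status)}
```

With a 20% test size, the 30/20 class mix in the full data is preserved in the test set, which keeps evaluation metrics comparable across classes.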
Support and Resources
GitHub repository, documentation, community forums, issue tracker, and research papers. For examples, see Usage Examples and the Jupyter notebooks.