Quick Start

Get up and running with HypoTestX in under five minutes.

Installation

pip install hypotestx

No mandatory external dependencies — all statistical math and HTTP calls are pure Python stdlib.

Optional extras for additional features:

pip install hypotestx[visualization]   # matplotlib + plotly for plots
pip install hypotestx[dev]             # testing and linting tools
pip install hypotestx[docs]            # sphinx, furo, myst-parser
pip install hypotestx[all]             # all optional extras

Your First Test

import hypotestx as hx
import pandas as pd

df = pd.read_csv("survey.csv")

# Zero config — built-in regex router, no API key needed
result = hx.analyze(df, "Do males earn more than females?")
print(result.summary())

Example output

[ Welch's t-test (unequal variances) ]
=======================================
Result: SIGNIFICANT (alpha = 0.05)
Test statistic: 3.2456
p-value: 0.0012
Degrees of freedom: 248.0
Cohen's d: 0.6834 (medium)
95% Confidence Interval: [1.2300, 4.5600]

Interpretation:
There is a statistically significant difference between the two groups
(t = 3.25, df = 248, p = 0.0012, Cohen's d = 0.68).

Reading the HypoResult

Every test returns a HypoResult object with the following key attributes:

Attribute

Type

Description

result.test_name

str

Human-readable test name

result.statistic

float

Test statistic value (t, F, χ², U, …)

result.p_value

float

p-value

result.is_significant

bool

True if p_value < alpha

result.effect_size

float

Effect size (Cohen’s d, r, η², V, …)

result.effect_size_name

str

Name of the effect size measure

result.effect_magnitude

str

'negligible', 'small', 'medium', 'large'

result.confidence_interval

tuple

(lower, upper) confidence interval

result.degrees_of_freedom

int/float

Degrees of freedom

result.sample_sizes

int/tuple

Sample size(s)

result.interpretation

str

Plain-English interpretation

result.routing_confidence

float

1.0 for LLM, 0.6 for regex fallback

result.routing_source

str

'llm' or 'fallback'

result.summary()

str

Formatted multi-line summary

result.to_dict()

dict

All fields as a plain dict


Using a Real LLM Backend

The default regex fallback is fast and works offline but has limited accuracy on complex questions. Use a real LLM backend for production:

Google Gemini (free tier — 1 500 req/day)

import os
import hypotestx as hx

result = hx.analyze(
    df,
    "Is there a salary difference between engineering and sales departments?",
    backend="gemini",
    api_key=os.environ["GEMINI_API_KEY"],
    model="gemini-2.0-flash",
    temperature=0.0,
)
print(result.summary())

Groq (free tier, very fast)

result = hx.analyze(
    df,
    "Is employee satisfaction correlated with tenure?",
    backend="groq",
    api_key=os.environ["GROQ_API_KEY"],
    model="llama-3.3-70b-versatile",
)

Ollama (fully offline, no API key)

# 1. Install Ollama: https://ollama.com
# 2. Pull a model
ollama pull llama3.2
result = hx.analyze(
    df,
    "Are there differences in performance scores across teams?",
    backend="ollama",
    model="llama3.2",
)

Direct API

If you already know which test you want, call it directly with full parameter control:

import hypotestx as hx

# Two-sample t-test
males   = df[df["gender"] == "M"]["salary"].tolist()
females = df[df["gender"] == "F"]["salary"].tolist()
result  = hx.ttest_2samp(males, females, alternative="greater", equal_var=False)

# Pearson correlation
result = hx.pearson(df["age"].tolist(), df["salary"].tolist())

# One-way ANOVA
groups = [df[df["dept"] == d]["score"].tolist() for d in df["dept"].unique()]
result = hx.anova_1way(*groups, alpha=0.01)

print(result.p_value)
print(result.effect_magnitude)

See Direct API for a full reference of all 12 test functions.