Complete Async API Performance Testing Guide
Test Architecture Overview
Test Goals
Compare the performance of async HTTP clients written in Python, C++, and Rust against an order-placement API.
Key Metrics
- Round Trip Time (RTT): total time from the client sending a request to receiving the response
- Server Latency: server receive time minus client send time (only meaningful when client and server clocks are synchronized, e.g. when both run on the same host)
- Throughput: requests processed per second (RPS)
- P50/P95/P99 latency: percentiles of the latency distribution
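All percentile figures in this guide use the nearest-rank method over a sorted sample, which is also what the clients below implement. A minimal sketch with illustrative values:

```python
def percentile(sorted_ns, p):
    # Nearest-rank percentile over an ascending list of nanosecond latencies
    return sorted_ns[min(int(len(sorted_ns) * p), len(sorted_ns) - 1)]

latencies_ns = sorted([950_000, 1_100_000, 1_200_000, 3_400_000, 8_900_000])
print(percentile(latencies_ns, 0.50) / 1e6, "ms")  # P50
print(percentile(latencies_ns, 0.99) / 1e6, "ms")  # P99
```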
Test Parameters
- Total requests: 1000-10000
- Concurrency: 50-200
- Payload size: ~200 bytes (simulating realistic order data)
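To confirm the payload really lands near the ~200-byte target, you can measure the serialized order; a quick sketch using the same fields as the clients below (the JSON body alone is roughly 110 bytes, and HTTP headers push the on-wire request toward 200+):

```python
import json
import time

order = {
    "order_id": "PY_12345",
    "symbol": "AAPL",
    "quantity": 100,
    "price": 150.25,
    "timestamp": time.time_ns(),
}
print(len(json.dumps(order).encode()), "bytes")  # JSON body size, before headers
```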
Server-Side Implementations
Option 1: Rust Server (Actix-web) [Recommended]
Cargo.toml
[package]
name = "rust_server"
version = "0.1.0"
edition = "2021"
[dependencies]
actix-web = "4"
tokio = { version = "1", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = "0.4"
env_logger = "0.11"
src/main.rs
use actix_web::{web, App, HttpServer, HttpResponse};
use serde::{Deserialize, Serialize};
use std::time::{SystemTime, UNIX_EPOCH};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

#[derive(Deserialize)]
struct Order {
    order_id: String,
    symbol: String,
    quantity: i32,
    price: f64,
    timestamp: u128,
}

#[derive(Serialize)]
struct OrderResponse {
    status: String,
    order_id: String,
    server_receive_time: u128,
    client_send_time: u128,
    latency_ns: i128,
}

struct AppState {
    request_count: AtomicUsize,
}

async fn place_order(
    order: web::Json<Order>,
    data: web::Data<Arc<AppState>>,
) -> HttpResponse {
    // Timestamp on arrival; latency_ns assumes the client clock is comparable
    let server_receive_time = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    data.request_count.fetch_add(1, Ordering::SeqCst);
    let latency_ns = server_receive_time as i128 - order.timestamp as i128;
    let response = OrderResponse {
        status: "success".to_string(),
        order_id: order.order_id.clone(),
        server_receive_time,
        client_send_time: order.timestamp,
        latency_ns,
    };
    HttpResponse::Ok().json(response)
}

async fn get_stats(data: web::Data<Arc<AppState>>) -> HttpResponse {
    let count = data.request_count.load(Ordering::SeqCst);
    HttpResponse::Ok().json(serde_json::json!({ "total_requests": count }))
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    env_logger::init_from_env(env_logger::Env::new().default_filter_or("warn"));
    println!("Starting Rust server on port 8000...");
    // Shared across workers: each worker clones the Arc, not the state
    let app_state = Arc::new(AppState {
        request_count: AtomicUsize::new(0),
    });
    HttpServer::new(move || {
        App::new()
            .app_data(web::Data::new(app_state.clone()))
            .route("/order", web::post().to(place_order))
            .route("/stats", web::get().to(get_stats))
    })
    .workers(8)
    .bind("0.0.0.0:8000")?
    .run()
    .await
}
Option 2: Go Server (Gin)
go.mod
module server
go 1.21
require github.com/gin-gonic/gin v1.9.1
server.go
package main
import (
"fmt"
"net/http"
"sync/atomic"
"time"
"github.com/gin-gonic/gin"
)
type Order struct {
OrderID string `json:"order_id"`
Symbol string `json:"symbol"`
Quantity int `json:"quantity"`
Price float64 `json:"price"`
Timestamp int64 `json:"timestamp"`
}
type OrderResponse struct {
Status string `json:"status"`
OrderID string `json:"order_id"`
ServerReceiveTime int64 `json:"server_receive_time"`
ClientSendTime int64 `json:"client_send_time"`
LatencyNs int64 `json:"latency_ns"`
}
var requestCount uint64
func placeOrder(c *gin.Context) {
var order Order
if err := c.ShouldBindJSON(&order); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return
}
serverReceiveTime := time.Now().UnixNano()
atomic.AddUint64(&requestCount, 1)
latencyNs := serverReceiveTime - order.Timestamp
response := OrderResponse{
Status: "success",
OrderID: order.OrderID,
ServerReceiveTime: serverReceiveTime,
ClientSendTime: order.Timestamp,
LatencyNs: latencyNs,
}
c.JSON(http.StatusOK, response)
}
func getStats(c *gin.Context) {
count := atomic.LoadUint64(&requestCount)
c.JSON(http.StatusOK, gin.H{
"total_requests": count,
})
}
func main() {
gin.SetMode(gin.ReleaseMode)
r := gin.New()
r.Use(gin.Recovery())
r.POST("/order", placeOrder)
r.GET("/stats", getStats)
fmt.Println("Starting Go server on port 8000...")
r.Run(":8000")
}
Option 3: Python Server (FastAPI + uvloop) [Fallback]
requirements.txt
fastapi==0.109.0
uvicorn[standard]==0.27.0
uvloop==0.19.0
pydantic==2.5.0
optimized_server.py
import asyncio
import multiprocessing
import time

import uvicorn
import uvloop
from fastapi import FastAPI
from pydantic import BaseModel

# Replace the default event loop with uvloop for lower overhead
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

app = FastAPI()


class Order(BaseModel):
    order_id: str
    symbol: str
    quantity: int
    price: float
    timestamp: int


@app.post("/order")
async def place_order(order: Order):
    server_receive_time = time.time_ns()
    return {
        "status": "success",
        "order_id": order.order_id,
        "server_receive_time": server_receive_time,
        "client_send_time": order.timestamp,
        "latency_ns": server_receive_time - order.timestamp,
    }


@app.get("/stats")
async def get_stats():
    # With multiple worker processes there is no shared counter here;
    # this endpoint doubles as a health check for the test scripts.
    return {"status": "ok"}


if __name__ == "__main__":
    workers = multiprocessing.cpu_count()
    uvicorn.run(
        "optimized_server:app",
        host="0.0.0.0",
        port=8000,
        workers=workers,
        loop="uvloop",
        log_level="warning",
        access_log=False,
    )
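Before running a full benchmark, it is worth confirming that whichever server you started actually answers. A stdlib-only smoke test, assuming the server listens on localhost:8000:

```python
import json
import time
import urllib.request

# Send a single order and print the parsed response
order = {
    "order_id": "SMOKE_1",
    "symbol": "AAPL",
    "quantity": 100,
    "price": 150.25,
    "timestamp": time.time_ns(),
}
req = urllib.request.Request(
    "http://localhost:8000/order",
    data=json.dumps(order).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req, timeout=5) as resp:
    print(json.loads(resp.read()))
```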
Client-Side Implementations
Python Client
requirements.txt
# Note: asyncio ships with the standard library; do not install the outdated PyPI "asyncio" package.
aiohttp==3.9.0
python_client.py
import asyncio
import time
import statistics
from typing import List

import aiohttp


class PythonBenchmark:
    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    async def send_order(self, session: aiohttp.ClientSession, order_id: int):
        order = {
            "order_id": f"PY_{order_id}",
            "symbol": "AAPL",
            "quantity": 100,
            "price": 150.25,
            "timestamp": time.time_ns(),
        }
        start = time.time_ns()
        try:
            async with session.post(f"{self.base_url}/order", json=order) as response:
                result = await response.json()
            end = time.time_ns()
            return {
                "round_trip_ns": end - start,
                "server_latency_ns": result["latency_ns"],
                "success": True,
            }
        except Exception as e:
            return {
                "round_trip_ns": 0,
                "server_latency_ns": 0,
                "success": False,
                "error": str(e),
            }

    async def benchmark(self, num_requests: int = 1000, concurrent: int = 100):
        # force_close=True disables keep-alive, so every request pays the TCP
        # handshake; set it to False to measure with connection reuse instead.
        connector = aiohttp.TCPConnector(limit=concurrent, force_close=True)
        timeout = aiohttp.ClientTimeout(total=30)
        async with aiohttp.ClientSession(connector=connector, timeout=timeout) as session:
            tasks = []
            all_results = []
            for i in range(num_requests):
                tasks.append(self.send_order(session, i))
                # Fire requests in waves of `concurrent` tasks
                if len(tasks) >= concurrent:
                    all_results.extend(await asyncio.gather(*tasks))
                    tasks = []
            if tasks:
                all_results.extend(await asyncio.gather(*tasks))
            return all_results

    def print_stats(self, results: List[dict], duration: float):
        successful = [r for r in results if r["success"]]
        failed = len(results) - len(successful)
        if not successful:
            print("All requests failed!")
            return
        round_trips = sorted(r["round_trip_ns"] for r in successful)
        server_latencies = sorted(r["server_latency_ns"] for r in successful)
        print(f"\n{'=' * 50}")
        print("Python Client Results")
        print(f"{'=' * 50}")
        print(f"Total Time: {duration:.2f}s")
        print(f"Total Requests: {len(results)}")
        print(f"Successful: {len(successful)}")
        print(f"Failed: {failed}")
        print(f"Throughput: {len(successful)/duration:.2f} req/s")
        print("\nRound Trip Time:")
        print(f"  Average: {statistics.mean(round_trips)/1e6:.2f}ms")
        print(f"  P50: {round_trips[len(round_trips)//2]/1e6:.2f}ms")
        print(f"  P95: {round_trips[int(len(round_trips)*0.95)]/1e6:.2f}ms")
        print(f"  P99: {round_trips[int(len(round_trips)*0.99)]/1e6:.2f}ms")
        print("\nServer Latency:")
        print(f"  Average: {statistics.mean(server_latencies)/1e6:.2f}ms")


async def main():
    benchmark = PythonBenchmark()
    print("Python Client Benchmark Starting...")
    print("Warming up...")
    await benchmark.benchmark(num_requests=100, concurrent=10)
    print("Running benchmark...")
    start_time = time.time()
    results = await benchmark.benchmark(num_requests=5000, concurrent=100)
    duration = time.time() - start_time
    benchmark.print_stats(results, duration)


if __name__ == "__main__":
    asyncio.run(main())
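One caveat about the wave-based batching above: in-flight requests drop to zero at the end of every wave, which understates achievable throughput. A sketch of an alternative, written as a hypothetical drop-in method for PythonBenchmark, keeps exactly `concurrent` requests in flight with a semaphore:

```python
import asyncio
import aiohttp

async def benchmark_steady(self, num_requests: int = 1000, concurrent: int = 100):
    # Hypothetical variant of PythonBenchmark.benchmark: a semaphore caps
    # concurrency without draining between waves.
    sem = asyncio.Semaphore(concurrent)

    async def bounded(session, i):
        async with sem:
            return await self.send_order(session, i)

    connector = aiohttp.TCPConnector(limit=concurrent)
    async with aiohttp.ClientSession(connector=connector) as session:
        return await asyncio.gather(*(bounded(session, i) for i in range(num_requests)))
```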
C++ Client
Note that cpr::Post is a blocking call, so concurrency in this client comes from std::async threads rather than an event loop; the results measure a thread-pool client rather than a truly asynchronous one.
CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(cpp_client)
set(CMAKE_CXX_STANDARD 17)
find_package(Threads REQUIRED)
# Install these libraries via vcpkg or manually
find_package(cpr CONFIG REQUIRED)
find_package(nlohmann_json CONFIG REQUIRED)
add_executable(cpp_client cpp_client.cpp)
target_link_libraries(cpp_client
PRIVATE
cpr::cpr
nlohmann_json::nlohmann_json
Threads::Threads
)
cpp_client.cpp
#include <iostream>
#include <chrono>
#include <vector>
#include <future>
#include <algorithm>
#include <numeric>
#include <thread>
#include <cpr/cpr.h>
#include <nlohmann/json.hpp>
using json = nlohmann::json;
using namespace std::chrono;
struct Result {
long long round_trip_ns;
long long server_latency_ns;
bool success;
};
class CppBenchmark {
private:
std::string base_url;
public:
CppBenchmark(const std::string& url = "http://localhost:8000")
: base_url(url) {}
Result send_order(int order_id) {
auto now = duration_cast<nanoseconds>(
system_clock::now().time_since_epoch()
).count();
json order;
order["order_id"] = "CPP_" + std::to_string(order_id);
order["symbol"] = "AAPL";
order["quantity"] = 100;
order["price"] = 150.25;
order["timestamp"] = now;
auto start = high_resolution_clock::now();
try {
auto response = cpr::Post(
cpr::Url{base_url + "/order"},
cpr::Header{{"Content-Type", "application/json"}},
cpr::Body{order.dump()}
);
auto end = high_resolution_clock::now();
auto round_trip = duration_cast<nanoseconds>(end - start).count();
if (response.status_code == 200) {
json resp_json = json::parse(response.text);
return {
round_trip,
resp_json["latency_ns"],
true
};
}
} catch (const std::exception& e) {
// Handle error
}
return {0, 0, false};
}
std::vector<Result> benchmark(int num_requests, int concurrent) {
std::vector<std::future<Result>> futures;
std::vector<Result> results;
for (int i = 0; i < num_requests; i++) {
futures.push_back(
std::async(std::launch::async,
&CppBenchmark::send_order, this, i)
);
if (futures.size() >= static_cast<size_t>(concurrent)) {
for (auto& f : futures) {
results.push_back(f.get());
}
futures.clear();
}
}
for (auto& f : futures) {
results.push_back(f.get());
}
return results;
}
void print_stats(const std::vector<Result>& results, double duration) {
std::vector<Result> successful;
std::copy_if(results.begin(), results.end(),
std::back_inserter(successful),
[](const Result& r) { return r.success; });
if (successful.empty()) {
std::cout << "All requests failed!" << std::endl;
return;
}
std::vector<long long> round_trips;
std::vector<long long> server_latencies;
for (const auto& r : successful) {
round_trips.push_back(r.round_trip_ns);
server_latencies.push_back(r.server_latency_ns);
}
std::sort(round_trips.begin(), round_trips.end());
std::sort(server_latencies.begin(), server_latencies.end());
double avg_rt = std::accumulate(round_trips.begin(),
round_trips.end(), 0.0) / round_trips.size();
double avg_sl = std::accumulate(server_latencies.begin(),
server_latencies.end(), 0.0) / server_latencies.size();
std::cout << "\n==================================================" << std::endl;
std::cout << "C++ Client Results" << std::endl;
std::cout << "==================================================" << std::endl;
std::cout << "Total Time: " << duration << "s" << std::endl;
std::cout << "Total Requests: " << results.size() << std::endl;
std::cout << "Successful: " << successful.size() << std::endl;
std::cout << "Failed: " << results.size() - successful.size() << std::endl;
std::cout << "Throughput: " << successful.size() / duration << " req/s" << std::endl;
std::cout << "\nRound Trip Time:" << std::endl;
std::cout << " Average: " << avg_rt / 1e6 << "ms" << std::endl;
std::cout << " P50: " << round_trips[round_trips.size()/2] / 1e6 << "ms" << std::endl;
std::cout << " P95: " << round_trips[round_trips.size()*95/100] / 1e6 << "ms" << std::endl;
std::cout << " P99: " << round_trips[round_trips.size()*99/100] / 1e6 << "ms" << std::endl;
std::cout << "\nServer Latency:" << std::endl;
std::cout << " Average: " << avg_sl / 1e6 << "ms" << std::endl;
}
};
int main() {
CppBenchmark benchmark;
std::cout << "C++ Client Benchmark Starting..." << std::endl;
std::cout << "Warming up..." << std::endl;
benchmark.benchmark(100, 10);
std::cout << "Running benchmark..." << std::endl;
auto start = high_resolution_clock::now();
auto results = benchmark.benchmark(5000, 100);
auto end = high_resolution_clock::now();
double duration = duration_cast<milliseconds>(end - start).count() / 1000.0;
benchmark.print_stats(results, duration);
return 0;
}
Rust Client
Cargo.toml
[package]
name = "rust_client"
version = "0.1.0"
edition = "2021"
[dependencies]
tokio = { version = "1", features = ["full"] }
reqwest = { version = "0.11", features = ["json"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
futures = "0.3"
src/main.rs
use futures::future::join_all;
use serde::{Deserialize, Serialize};
use std::time::{Instant, SystemTime, UNIX_EPOCH};

#[derive(Serialize, Deserialize)]
struct Order {
    order_id: String,
    symbol: String,
    quantity: i32,
    price: f64,
    timestamp: u128,
}

#[allow(dead_code)] // only latency_ns is read; other fields mirror the response schema
#[derive(Deserialize)]
struct OrderResponse {
    status: String,
    order_id: String,
    server_receive_time: u128,
    client_send_time: u128,
    latency_ns: i128,
}

// Note: this shadows std::result::Result within this file
#[derive(Debug, Clone)]
struct Result {
    round_trip_ns: u128,
    server_latency_ns: i128,
    success: bool,
}

struct RustBenchmark {
    base_url: String,
    client: reqwest::Client,
}

impl RustBenchmark {
    fn new(base_url: &str) -> Self {
        let client = reqwest::Client::builder()
            .pool_max_idle_per_host(200)
            .build()
            .unwrap();
        Self {
            base_url: base_url.to_string(),
            client,
        }
    }

    async fn send_order(&self, order_id: i32) -> Result {
        let timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_nanos();
        let order = Order {
            order_id: format!("RUST_{}", order_id),
            symbol: "AAPL".to_string(),
            quantity: 100,
            price: 150.25,
            timestamp,
        };
        let start = Instant::now();
        match self
            .client
            .post(format!("{}/order", self.base_url))
            .json(&order)
            .send()
            .await
        {
            Ok(response) => match response.json::<OrderResponse>().await {
                Ok(resp) => Result {
                    round_trip_ns: start.elapsed().as_nanos(),
                    server_latency_ns: resp.latency_ns,
                    success: true,
                },
                Err(_) => Result {
                    round_trip_ns: 0,
                    server_latency_ns: 0,
                    success: false,
                },
            },
            Err(_) => Result {
                round_trip_ns: 0,
                server_latency_ns: 0,
                success: false,
            },
        }
    }

    async fn benchmark(&self, num_requests: usize, concurrent: usize) -> Vec<Result> {
        // Fire requests in waves of `concurrent` futures
        let mut tasks = Vec::new();
        let mut all_results = Vec::new();
        for i in 0..num_requests {
            tasks.push(self.send_order(i as i32));
            if tasks.len() >= concurrent {
                all_results.extend(join_all(tasks).await);
                tasks = Vec::new();
            }
        }
        if !tasks.is_empty() {
            all_results.extend(join_all(tasks).await);
        }
        all_results
    }

    fn print_stats(&self, results: &[Result], duration: f64) {
        let successful: Vec<&Result> = results.iter().filter(|r| r.success).collect();
        if successful.is_empty() {
            println!("All requests failed!");
            return;
        }
        let mut round_trips: Vec<u128> =
            successful.iter().map(|r| r.round_trip_ns).collect();
        let mut server_latencies: Vec<i128> =
            successful.iter().map(|r| r.server_latency_ns).collect();
        round_trips.sort();
        server_latencies.sort();
        let avg_rt =
            round_trips.iter().sum::<u128>() as f64 / round_trips.len() as f64;
        let avg_sl =
            server_latencies.iter().sum::<i128>() as f64 / server_latencies.len() as f64;
        println!("\n{}", "=".repeat(50));
        println!("Rust Client Results");
        println!("{}", "=".repeat(50));
        println!("Total Time: {:.2}s", duration);
        println!("Total Requests: {}", results.len());
        println!("Successful: {}", successful.len());
        println!("Failed: {}", results.len() - successful.len());
        println!("Throughput: {:.2} req/s", successful.len() as f64 / duration);
        println!("\nRound Trip Time:");
        println!("  Average: {:.2}ms", avg_rt / 1e6);
        println!("  P50: {:.2}ms", round_trips[round_trips.len() / 2] as f64 / 1e6);
        println!("  P95: {:.2}ms", round_trips[round_trips.len() * 95 / 100] as f64 / 1e6);
        println!("  P99: {:.2}ms", round_trips[round_trips.len() * 99 / 100] as f64 / 1e6);
        println!("\nServer Latency:");
        println!("  Average: {:.2}ms", avg_sl / 1e6);
    }
}

#[tokio::main]
async fn main() {
    let benchmark = RustBenchmark::new("http://localhost:8000");
    println!("Rust Client Benchmark Starting...");
    println!("Warming up...");
    let _ = benchmark.benchmark(100, 10).await;
    println!("Running benchmark...");
    let start = Instant::now();
    let results = benchmark.benchmark(5000, 100).await;
    let duration = start.elapsed().as_secs_f64();
    benchmark.print_stats(&results, duration);
}
Environment Setup
System Tuning
Linux system tuning (Ubuntu/Debian)
# Increase the file descriptor limit
sudo bash -c 'echo "* soft nofile 65535" >> /etc/security/limits.conf'
sudo bash -c 'echo "* hard nofile 65535" >> /etc/security/limits.conf'
# TCP tuning
sudo sysctl -w net.core.somaxconn=65535
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=65535
sudo sysctl -w net.ipv4.ip_local_port_range="1024 65535"
sudo sysctl -w net.ipv4.tcp_tw_reuse=1
# Persist across reboots
sudo bash -c 'echo "net.core.somaxconn=65535" >> /etc/sysctl.conf'
sudo bash -c 'echo "net.ipv4.tcp_max_syn_backlog=65535" >> /etc/sysctl.conf'
Installing Dependencies
Python environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Rust environment
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
C++ environment (using vcpkg)
# Install vcpkg
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
# Install dependencies
./vcpkg install cpr nlohmann-json
Go environment
# Download and install Go
wget https://go.dev/dl/go1.21.5.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.21.5.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
Running the Tests
Automated test script
run_complete_benchmark.sh
#!/bin/bash
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Configuration
NUM_REQUESTS=5000
CONCURRENT=100
WARMUP_REQUESTS=100
echo -e "${GREEN}=== Complete Async API Benchmark ===${NC}"
echo "Configuration:"
echo " Requests: $NUM_REQUESTS"
echo " Concurrent: $CONCURRENT"
echo ""
# Function to check if port is in use
check_port() {
lsof -Pi :8000 -sTCP:LISTEN -t >/dev/null
}
# Function to wait for server
wait_for_server() {
echo -n "Waiting for server to start"
for i in {1..30}; do
if curl -s http://localhost:8000/stats > /dev/null; then
echo -e " ${GREEN}✓${NC}"
return 0
fi
echo -n "."
sleep 1
done
echo -e " ${RED}✗${NC}"
return 1
}
# Function to run client benchmark
run_client() {
local client_name=$1
local client_cmd=$2
echo -e "\n${YELLOW}Testing $client_name Client...${NC}"
eval $client_cmd
}
# Kill any existing server
if check_port; then
echo "Killing existing server on port 8000..."
kill $(lsof -Pi :8000 -sTCP:LISTEN -t)
sleep 2
fi
# Test with different servers
servers=("rust" "go" "python")
for server in "${servers[@]}"; do
echo -e "\n${GREEN}═══════════════════════════════════════${NC}"
echo -e "${GREEN}Testing with $server server${NC}"
echo -e "${GREEN}═══════════════════════════════════════${NC}"
# Start server
case $server in
    "rust")
        echo "Building and starting Rust server..."
        # Build in a subshell so the script's working directory never changes
        (cd rust_server && cargo build --release)
        ./rust_server/target/release/rust_server &
        ;;
    "go")
        echo "Building and starting Go server..."
        (cd go_server && go build)
        ./go_server/server &
        ;;
    "python")
        echo "Starting Python server..."
        python3 optimized_server.py &
        ;;
esac
SERVER_PID=$!
# Wait for server to be ready
if ! wait_for_server; then
echo -e "${RED}Server failed to start!${NC}"
kill $SERVER_PID 2>/dev/null
continue
fi
# Run all clients
run_client "Python" "python3 python_client.py"
run_client "C++" "./cpp_client/build/cpp_client"
run_client "Rust" "cd rust_client && cargo run --release && cd .."
# Get server stats
echo -e "\n${YELLOW}Server Stats:${NC}"
curl -s http://localhost:8000/stats | jq .
# Stop server
echo -e "\nStopping server..."
kill $SERVER_PID
wait $SERVER_PID 2>/dev/null
sleep 2
done
echo -e "\n${GREEN}═══════════════════════════════════════${NC}"
echo -e "${GREEN}Benchmark Complete!${NC}"
echo -e "${GREEN}═══════════════════════════════════════${NC}"
Single-test script
test_single.sh
#!/bin/bash
SERVER=$1
CLIENT=$2
if [ -z "$SERVER" ] || [ -z "$CLIENT" ]; then
echo "Usage: ./test_single.sh [rust|go|python] [rust|cpp|python]"
exit 1
fi
# Start server
case $SERVER in
    "rust")
        (cd rust_server && cargo run --release) &
        ;;
    "go")
        (cd go_server && go run server.go) &
        ;;
    "python")
        python3 optimized_server.py &
        ;;
esac
SERVER_PID=$!
sleep 3
# Run client
case $CLIENT in
    "rust")
        (cd rust_client && cargo run --release)
        ;;
    "cpp")
        ./cpp_client/build/cpp_client
        ;;
    "python")
        python3 python_client.py
        ;;
esac
kill $SERVER_PID
Performance Analysis
Expected Results
The figures below are rough expectations; actual numbers depend heavily on hardware, kernel tuning, and the network path.
Server comparison
| Server | Latency (P50) | Latency (P99) | Max RPS | CPU Usage |
|---|---|---|---|---|
| Rust | 10-30μs | 50-100μs | 100k+ | 30-50% |
| Go | 20-50μs | 100-200μs | 50k+ | 40-60% |
| Python | 100-300μs | 500-1000μs | 10k+ | 70-90% |
Client comparison
| Client | Latency (P50) | Latency (P99) | Concurrency | Memory Usage |
|---|---|---|---|---|
| Rust | 0.5-2ms | 5-10ms | Very high | Low |
| C++ | 0.8-3ms | 8-15ms | High | Low |
| Python | 2-8ms | 15-30ms | Medium | High |
Monitoring Tools
System monitoring
# CPU and memory monitoring
htop
# Count open connections to port 8000
netstat -an | grep :8000 | wc -l
# I/O monitoring
iotop
# Detailed system statistics
dstat -tcmndylp
Load-testing tools
# Create the Lua script for POST requests first
cat > post.lua << 'EOF'
wrk.method = "POST"
wrk.body = '{"order_id":"TEST_1","symbol":"AAPL","quantity":100,"price":150.25,"timestamp":1234567890}'
wrk.headers["Content-Type"] = "application/json"
EOF
# Use wrk to probe the server's limits
wrk -t12 -c400 -d30s --latency \
    -s post.lua \
    http://localhost:8000/order
# Test with ab (order.json should hold the same JSON body used in post.lua)
ab -n 10000 -c 100 -p order.json -T application/json \
    http://localhost:8000/order
Key Points for Result Analysis
1. Latency analysis (see the sketch after this list)
   - Round Trip Time = network latency + server processing time + client processing time
   - Server Latency mainly reflects one-way network latency
   - The gap between the two reflects client- and server-side processing efficiency
2. Throughput analysis
   - Higher RPS (requests per second) is better
   - Watch whether you hit a bottleneck (CPU, network, or connection count)
3. Stability analysis
   - The spread between P99 and P50 reflects system stability
   - The smaller the spread, the more consistent the performance
4. Resource usage analysis
   - CPU usage should stay below 80%
   - Memory should stay flat, with no leaks
   - File descriptor usage must stay within the configured limit
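As referenced in item 1, the decomposition and stability checks come down to a few lines of arithmetic over the per-request results; a sketch assuming the result dicts produced by python_client.py:

```python
def analyze(results):
    ok = [r for r in results if r["success"]]
    rtts = sorted(r["round_trip_ns"] for r in ok)
    p50, p99 = rtts[len(rtts) // 2], rtts[int(len(rtts) * 0.99)]
    # RTT minus the outbound server latency approximates the return leg
    # plus client- and server-side processing time
    processing = sorted(r["round_trip_ns"] - r["server_latency_ns"] for r in ok)
    print(f"P99/P50 ratio: {p99 / p50:.1f} (closer to 1 means more stable)")
    print(f"Median processing + return leg: {processing[len(ok) // 2] / 1e6:.2f} ms")
```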
Optimization Recommendations
General optimizations
1. Connection pool management (see the aiohttp sketch after this list)
   - Size the connection pool appropriately
   - Reuse connections via keep-alive
   - Set connection timeouts
2. Concurrency control
   - Tune concurrency to the number of CPU cores
   - Apply backpressure
   - Avoid excessive concurrency, which degrades performance
3. Protocol optimizations
   - Consider HTTP/2
   - Consider a binary protocol such as gRPC
   - Shrink the payload
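A minimal aiohttp sketch of the first item's three knobs (the values are illustrative, not tuned recommendations):

```python
import aiohttp

async def make_session():
    connector = aiohttp.TCPConnector(
        limit=100,             # pool size: maximum simultaneous connections
        limit_per_host=100,    # all benchmark traffic targets a single host
        keepalive_timeout=30,  # reuse idle connections for up to 30 s
        force_close=False,     # keep-alive stays enabled
    )
    timeout = aiohttp.ClientTimeout(total=30, connect=5)
    return aiohttp.ClientSession(connector=connector, timeout=timeout)
```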
Language-specific optimizations
Python:
- Use uvloop instead of the default event loop
- Consider PyPy or Cython
- Try httpx as an alternative to aiohttp (see the sketch below)
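If you try httpx, the request loop stays nearly identical to the aiohttp client; a minimal sketch (httpx.AsyncClient pools and reuses connections by default):

```python
import asyncio
import time

import httpx

async def send_once():
    async with httpx.AsyncClient(timeout=30.0) as client:
        order = {"order_id": "HTTPX_1", "symbol": "AAPL", "quantity": 100,
                 "price": 150.25, "timestamp": time.time_ns()}
        r = await client.post("http://localhost:8000/order", json=order)
        print(r.json())

asyncio.run(send_once())
```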
Rust:
- Tune the tokio runtime worker count
- Drop down to hyper directly
- Enable LTO (Link Time Optimization)
C++:
- Use jemalloc or tcmalloc
- Add compiler optimization flags (-O3, -march=native)
- Consider boost.beast
Troubleshooting
Common issues
1. "Too many open files" errors
   - Raise the file descriptor limit: ulimit -n 65535 (or from inside the process; see the sketch after this list)
2. Connection refused
   - Check that the server is running
   - Check firewall settings
   - Confirm the port is not already taken
3. High latency
   - Check CPU usage
   - Check network latency
   - Adjust the concurrency level
4. Memory leaks
   - Use valgrind (C++)
   - Use a memory profiler (Python)
   - Use heaptrack (Rust)
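As noted in item 1, `ulimit` only affects the current shell and its children; a process can also raise its own soft limit at startup (Linux/macOS, capped by the hard limit):

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# Raise the soft file-descriptor limit as far as the hard limit allows
target = 65535 if hard == resource.RLIM_INFINITY else min(65535, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
print("nofile:", resource.getrlimit(resource.RLIMIT_NOFILE))
```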
Summary
This testing framework lets you:
- Measure async HTTP client performance accurately across the three languages
- Keep the server from becoming the bottleneck, so results reflect real client performance
- Collect comprehensive metrics: latency, throughput, and stability
- Re-run the whole comparison through an automated, repeatable pipeline
Based on the results, pick the language that fits each scenario:
- Rust: highest performance; suited to high-frequency trading
- C++: high performance; suited to integration with existing C++ systems
- Python: fastest development; suited to prototyping and low- to mid-frequency trading