Weave Quickstart Guide for TypeScript

With W&B Weave and TypeScript, you can:

Log and debug language model inputs, outputs, and traces
Build rigorous, apples-to-apples evaluations for language model use cases
Organize all the information generated across the LLM workflow, from experimentation to evaluation to production

For more information, see the Weave documentation.

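Function tracking

To use Weave in your TypeScript code, initialize a new Weave project, then wrap each function you want to track with weave.op. After you wrap a function and call it, open the W&B dashboard to see how the call is tracked within your project. Weave also captures the function's code automatically; check the Code tab in the UI.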

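import * as weave from 'weave';
import OpenAI from 'openai';

// Initialize the Weave project that all traces below are logged to.
async function initializeWeaveProject() {
    const PROJECT = 'weave-examples';
    await weave.init(PROJECT);
}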

const stripUserInput = weave.op(function stripUserInput(userInput: string): string {
    return userInput.trim();
});

The following example demonstrates basic function tracking.

async function demonstrateBasicTracking() {
    const result = await stripUserInput("    hello    ");
    console.log('Basic tracking result:', result);
}
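
To build intuition for what a tracing wrapper records, here is a minimal sketch that does not depend on Weave. The names traceSync, CallRecord, and callLog are hypothetical and exist only for this illustration. It is not how weave.op is implemented; it handles synchronous functions only and keeps records in memory, whereas Weave logs calls to your W&B project.

```typescript
// Hypothetical record shape for illustration; Weave's actual call schema differs.
interface CallRecord {
    name: string;
    inputs: unknown[];
    output?: unknown;
    error?: string;
    durationMs: number;
}

// In-memory stand-in for a trace store.
const callLog: CallRecord[] = [];

// Illustrative tracing wrapper for synchronous functions. This is NOT weave.op.
function traceSync<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
    const name = fn.name || 'anonymous';
    return (...args: A): R => {
        const start = Date.now();
        try {
            const output = fn(...args);
            callLog.push({ name, inputs: args, output, durationMs: Date.now() - start });
            return output;
        } catch (err) {
            callLog.push({ name, inputs: args, error: String(err), durationMs: Date.now() - start });
            throw err; // re-throw so the caller still sees the original failure
        }
    };
}

const tracedStrip = traceSync(function stripInput(text: string): string {
    return text.trim();
});

console.log(tracedStrip('    hello    ')); // prints "hello"
console.log(callLog.length); // prints 1
```

Using a named function expression matters in this sketch because the wrapper reads fn.name to label the record, which is also why the Weave examples above pass named functions to weave.op.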

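OpenAI integration

Weave automatically tracks all OpenAI calls, including:

Token usage
API costs
Request/response pairs
Model configurations

In addition to OpenAI, Weave supports automatic logging for other LLM providers, such as Anthropic and Mistral. For the full list, see LLM Providers in the Integrations documentation.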

function initializeOpenAIClient() {
    return weave.wrapOpenAI(new OpenAI({
        apiKey: process.env.OPENAI_API_KEY
    }));
}

async function demonstrateOpenAITracking() {
    const client = initializeOpenAIClient();
    const result = await client.chat.completions.create({
        model: "gpt-4-turbo",
        messages: [{ role: "user", content: "Hello, how are you?" }],
    });
    console.log('OpenAI tracking result:', result);
}
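
Weave reports token usage and cost for you, but it can help to see how a cost figure relates to the usage object that an OpenAI chat completion returns (prompt_tokens, completion_tokens, total_tokens). The sketch below is an assumption-laden illustration: estimateCostUsd and Pricing are hypothetical names, and the prices are placeholders, not real rates for any model.

```typescript
// Mirrors the `usage` object on an OpenAI chat completion response.
interface TokenUsage {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
}

// Hypothetical per-million-token prices in USD. Replace with your provider's current rates.
interface Pricing {
    inputPerMillion: number;
    outputPerMillion: number;
}

// Input and output tokens are usually priced differently, so they are costed separately.
function estimateCostUsd(usage: TokenUsage, pricing: Pricing): number {
    const inputCost = (usage.prompt_tokens / 1e6) * pricing.inputPerMillion;
    const outputCost = (usage.completion_tokens / 1e6) * pricing.outputPerMillion;
    return inputCost + outputCost;
}

const exampleUsage: TokenUsage = { prompt_tokens: 1000, completion_tokens: 500, total_tokens: 1500 };
const placeholderPricing: Pricing = { inputPerMillion: 10, outputPerMillion: 30 }; // placeholder values
const estimatedCost = estimateCostUsd(exampleUsage, placeholderPricing);
console.log(estimatedCost.toFixed(4)); // prints "0.0250"
```

In the demo above, the same fields are available on result.usage, so you could pass that object to a function like this one if you want a figure outside the Weave UI.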

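Nested function tracking

Weave lets you track complex workflows by combining multiple tracked functions and LLM calls while preserving the entire execution trace. The benefits include:

Full visibility into your application's logic flow
Easier debugging of complex chains of operations
Opportunities for performance optimization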

async function demonstrateNestedTracking() {
    const client = initializeOpenAIClient();
    
    const correctGrammar = weave.op(async function correctGrammar(userInput: string): Promise<string> {
        const stripped = await stripUserInput(userInput);
        const response = await client.chat.completions.create({
            model: "gpt-4-turbo",
            messages: [
                {
                    role: "system",
                    content: "You are a grammar checker, correct the following user input."
                },
                { role: "user", content: stripped }
            ],
            temperature: 0,
        });
        return response.choices[0].message.content ?? '';
    });

    const grammarResult = await correctGrammar("That was so easy, it was a piece of pie!");
    console.log('Nested tracking result:', grammarResult);
}
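
The trace for correctGrammar shows stripUserInput and the OpenAI call as children of the outer call. The following sketch illustrates how a tracer can recover that parent/child structure. It is a simplified model and not Weave's implementation: traced, Span, and fakeModel are hypothetical names, the LLM call is replaced by a string substitution, and the stack-based approach is only correct for synchronous calls. Tracers that handle async code typically propagate context instead, for example with Node's AsyncLocalStorage.

```typescript
// Hypothetical span shape for illustration; a real trace schema is richer.
interface Span {
    name: string;
    children: Span[];
}

const rootSpans: Span[] = [];
const activeStack: Span[] = []; // the innermost active span is last

// Wrap a synchronous function so each call becomes a span nested under its caller's span.
function traced<A extends unknown[], R>(name: string, fn: (...args: A) => R): (...args: A) => R {
    return (...args: A): R => {
        const span: Span = { name, children: [] };
        const parent: Span | undefined = activeStack[activeStack.length - 1];
        (parent ? parent.children : rootSpans).push(span);
        activeStack.push(span);
        try {
            return fn(...args);
        } finally {
            activeStack.pop(); // restore the caller as the active span, even on error
        }
    };
}

const strip = traced('strip', (s: string) => s.trim());
const fakeModel = traced('fakeModel', (s: string) => s.replace('pie', 'cake')); // stand-in for an LLM call
const correct = traced('correct', (s: string) => fakeModel(strip(s)));

// Render a span and its descendants as indented lines.
function render(span: Span, indent: string = ''): string[] {
    let lines = [indent + span.name];
    for (const child of span.children) {
        lines = lines.concat(render(child, indent + '  '));
    }
    return lines;
}

const corrected = correct('  it was a piece of pie!  ');
console.log(corrected); // prints "it was a piece of cake!"
for (const root of rootSpans) {
    console.log(render(root).join('\n'));
}
```

The rendered tree has correct at the root with strip and fakeModel indented beneath it, which mirrors the nesting you see for correctGrammar in the Weave UI.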

Dataset management

You can create and manage datasets in Weave with the weave.Dataset class. Like Weave Models, weave.Dataset helps you:

Track and version your data
Organize test cases
Share datasets between team members
Run systematic evaluations

interface GrammarExample {
    userInput: string;
    expected: string;
}

function createGrammarDataset(): weave.Dataset<GrammarExample> {
    return new weave.Dataset<GrammarExample>({
        id: 'grammar-correction',
        rows: [
            {
                userInput: "That was so easy, it was a piece of pie!",
                expected: "That was so easy, it was a piece of cake!"
            },
            {
                userInput: "I write good",
                expected: "I write well"
            },
            {
                userInput: "LLM's are best",
                expected: "LLM's are the best"
            }
        ]
    });
}
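
To make "versioning your data" concrete, the sketch below derives a short identifier from the dataset's contents, so that any edit to a row yields a different identifier. This illustrates the general idea of content-based versioning only; it is not the scheme Weave uses. The names fnv1a, datasetDigest, and GrammarRow are hypothetical, and FNV-1a is chosen because it is short and dependency-free, not because it is collision-resistant.

```typescript
interface GrammarRow {
    userInput: string;
    expected: string;
}

// 32-bit FNV-1a over UTF-16 code units. Illustration only: not collision-resistant.
function fnv1a(text: string): string {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0; // 32-bit multiply, kept unsigned
    }
    return ('0000000' + hash.toString(16)).slice(-8);
}

// Serialize rows with a fixed field order so equal content always yields the same string.
function datasetDigest(rows: GrammarRow[]): string {
    const canonical = JSON.stringify(rows.map((r) => [r.userInput, r.expected]));
    return fnv1a(canonical);
}

const rowsV1: GrammarRow[] = [
    { userInput: 'That was so easy, it was a piece of pie!', expected: 'That was so easy, it was a piece of cake!' },
];
// Same data with one character changed in the expected answer.
const rowsV2: GrammarRow[] = [
    { userInput: 'That was so easy, it was a piece of pie!', expected: 'That was so easy, it was a piece of cake.' },
];

console.log(datasetDigest(rowsV1) === datasetDigest(rowsV1)); // prints true
console.log(datasetDigest(rowsV1), datasetDigest(rowsV2));
```

When you use weave.Dataset, you do not need to compute anything like this yourself; the sketch only shows why unchanged data can keep its version while edited data gets a new one.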

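Evaluation framework

Weave supports evaluation-driven development with the Evaluation class. Evaluations help you iterate reliably on your GenAI application. The Evaluation class does the following:

Assesses Model performance on a Dataset
Applies custom scoring functions
Generates detailed performance reports
Enables comparison between model versions

A complete evaluation tutorial is available at http://wandb.me/weave_eval_tut.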

class OpenAIGrammarCorrector {
    private oaiClient: ReturnType<typeof weave.wrapOpenAI>;
    
    constructor() {
        this.oaiClient = weave.wrapOpenAI(new OpenAI({
            apiKey: process.env.OPENAI_API_KEY
        }));
        this.predict = weave.op(this, this.predict);
    }

    async predict(userInput: string): Promise<string> {
        const response = await this.oaiClient.chat.completions.create({
            model: 'gpt-4-turbo',
            messages: [
                { 
                    role: "system", 
                    content: "You are a grammar checker, correct the following user input." 
                },
                { role: "user", content: userInput }
            ],
            temperature: 0
        });
        return response.choices[0].message.content ?? '';
    }
}

async function runEvaluation() {
    const corrector = new OpenAIGrammarCorrector();
    const dataset = createGrammarDataset();
    
    const exactMatch = weave.op(
        function exactMatch({ modelOutput, datasetRow }: { 
            modelOutput: string; 
            datasetRow: GrammarExample 
        }): { match: boolean } {
            return { match: datasetRow.expected === modelOutput };
        },
        { name: 'exactMatch' }
    );

    const evaluation = new weave.Evaluation({
        dataset,
        scorers: [exactMatch],
    });

    const summary = await evaluation.evaluate({
        model: weave.op((args: { datasetRow: GrammarExample }) => 
            corrector.predict(args.datasetRow.userInput)
        )
    });
    console.log('Evaluation summary:', summary);
}
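
The exactMatch scorer above is strict: an answer that differs from the expected text only in capitalization or trailing punctuation counts as a miss. The sketch below shows, without Weave or a live model, what an evaluation loop computes and how a more lenient scorer changes the summary. Everything here is an assumption for illustration: evaluateSketch, cannedModel, and the normalized scorer are hypothetical, the scorers return a plain boolean rather than the { match } object used above, and the canned model returns fixed strings instead of calling an LLM.

```typescript
interface Example {
    userInput: string;
    expected: string;
}
type Model = (input: string) => string;
type Scorer = (args: { modelOutput: string; datasetRow: Example }) => boolean;

const exact: Scorer = ({ modelOutput, datasetRow }) => modelOutput === datasetRow.expected;

// Lenient comparison: ignores case, surrounding whitespace, and trailing punctuation.
const normalized: Scorer = ({ modelOutput, datasetRow }) => {
    const norm = (s: string) => s.trim().toLowerCase().replace(/[.!?]+$/, '');
    return norm(modelOutput) === norm(datasetRow.expected);
};

// Run the model once per row, then report the fraction of rows each scorer accepts.
function evaluateSketch(rows: Example[], model: Model, scorers: Record<string, Scorer>): Record<string, number> {
    const outputs = rows.map((row) => model(row.userInput));
    const summary: Record<string, number> = {};
    for (const name of Object.keys(scorers)) {
        const passed = rows.filter((row, i) => scorers[name]({ modelOutput: outputs[i], datasetRow: row })).length;
        summary[name] = rows.length === 0 ? 0 : passed / rows.length;
    }
    return summary;
}

// Canned stand-in for an LLM: fixes one input (with different casing) and echoes the rest.
const cannedModel: Model = (input) => (input === 'I write good' ? 'i write well.' : input);

const sketchRows: Example[] = [
    { userInput: 'I write good', expected: 'I write well' },
    { userInput: "LLM's are best", expected: "LLM's are the best" },
];

const sketchSummary = evaluateSketch(sketchRows, cannedModel, { exact, normalized });
console.log(sketchSummary); // exact is 0, normalized is 0.5
```

Here the first row passes only under the normalized scorer, so the two scores disagree. In a real evaluation you could pass more than one scorer in the scorers array to see both views side by side.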

The following main function runs all the demos.

async function main() {
    try {
        await initializeWeaveProject();
        await demonstrateBasicTracking();
        await demonstrateOpenAITracking();
        await demonstrateNestedTracking();
        await runEvaluation();
    } catch (error) {
        console.error('Error running demos:', error);
    }
}

// Invoke main so the demos actually run when this file is executed.
main();
