
Foundry Local Spring Boot Tutorial


Prerequisites

Before starting this tutorial, make sure you have:

  • Java 21 or higher installed on your system
  • Maven 3.6+ for building the project
  • Foundry Local installed and running

Install Foundry Local:

Note: Foundry Local CLI is available on Windows and macOS only. Linux is supported via the Foundry Local SDKs (Python, JavaScript, C#, Rust).

# Windows
winget install Microsoft.FoundryLocal

# macOS
brew tap microsoft/foundrylocal
brew install foundrylocal

Verify the installation:

foundry --version

Project Overview

This project consists of four main components:

  1. Application.java - The main Spring Boot application entry point
  2. FoundryLocalService.java - Service layer that handles AI communication
  3. application.properties - Configuration for Foundry Local connection
  4. pom.xml - Maven dependencies and project configuration

Understanding the Code

1. Application Configuration (application.properties)

File: src/main/resources/application.properties

foundry.local.base-url=http://localhost:5273/v1
# foundry.local.model is auto-detected from Foundry Local. Set it here to override:
# foundry.local.model=Phi-4-mini-instruct-cuda-gpu:5

What this does:

  • base-url: Specifies where Foundry Local is running, including the /v1 path for OpenAI API compatibility. The default port is 5273. If the port differs, check it with foundry service status.
  • model (optional): Names the AI model to use for text generation. By default, the application auto-detects the model by querying the Foundry Local /v1/models endpoint at startup, so you don't need to set this. You can still set it explicitly to override auto-detection if needed.

Key concept: Spring Boot automatically loads these properties and makes them available to your application using the @Value annotation.

2. Main Application Class (Application.java)

File: src/main/java/com/example/Application.java

@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(Application.class);
        app.setWebApplicationType(WebApplicationType.NONE);  // No web server needed
        app.run(args);
    }
}

What this does:

  • @SpringBootApplication enables Spring Boot auto-configuration
  • WebApplicationType.NONE tells Spring this is a command-line app, not a web server
  • The main method starts the Spring application

The Demo Runner:

@Bean
public CommandLineRunner foundryLocalRunner(FoundryLocalService foundryLocalService) {
    return args -> {
        System.out.println("=== Foundry Local Demo ===");
        System.out.println("Calling Foundry Local service...");
        
        String testMessage = "Hello! Can you tell me what you are and what model you're running?";
        System.out.println("Sending message: " + testMessage);
        
        String response = foundryLocalService.chat(testMessage);
        System.out.println("Response from Foundry Local:");
        System.out.println(response);
        System.out.println("=========================");
    };
}

What this does:

  • @Bean creates a component that Spring manages
  • CommandLineRunner runs code after Spring Boot starts up
  • foundryLocalService is automatically injected by Spring (dependency injection)
  • Sends a test message to the AI and displays the response

3. AI Service Layer (FoundryLocalService.java)

File: src/main/java/com/example/FoundryLocalService.java

Configuration Injection:

@Service
public class FoundryLocalService {
    
    @Value("${foundry.local.base-url:http://localhost:5273/v1}")
    private String baseUrl;
    
    @Value("${foundry.local.model:}")
    private String model;    // Auto-detected if empty

What this does:

  • @Service tells Spring this class provides business logic
  • @Value injects configuration values from application.properties
  • The model defaults to empty, which triggers auto-detection from Foundry Local at startup. This means the app works with any model loaded in Foundry Local without manual configuration.

Client Initialization:

@PostConstruct
public void init() {
    // Auto-detect the model from Foundry Local if not explicitly configured
    if (model == null || model.isBlank()) {
        model = detectModel();
    }

    this.openAIClient = OpenAIOkHttpClient.builder()
            .baseUrl(baseUrl)                // Base URL already includes /v1 from configuration
            .apiKey("not-needed")            // Local server doesn't need real API key
            .build();
}

What this does:

  • @PostConstruct runs this method after Spring creates the service
  • If no model is configured, it queries Foundry Local's /v1/models endpoint and picks the first loaded model
  • Creates an OpenAI client that points to your local Foundry Local instance
  • The base URL from application.properties already includes /v1 for OpenAI API compatibility
  • API key is set to "not-needed" because local development doesn't require authentication
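The detectModel() helper called in init() is referenced above but not shown in this tutorial. A minimal sketch of the idea (hypothetical, not the project's actual implementation): fetch baseUrl + "/models" and take the first model's id from the OpenAI-style {"data":[{"id":...}]} payload. To keep the sketch dependency-free it pulls the id out with a regex; the real service would more likely parse the JSON with Jackson, which the project already depends on.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of model auto-detection against the /v1/models endpoint.
public class ModelDetector {

    // Extract the first "id" value from an OpenAI-style model list payload, e.g.
    // {"object":"list","data":[{"id":"Phi-4-mini-instruct-generic-cpu","object":"model"}]}.
    // A regex keeps this sketch stdlib-only; real code would use Jackson instead.
    static String firstModelId(String modelsJson) {
        Matcher m = Pattern.compile("\"id\"\\s*:\\s*\"([^\"]+)\"").matcher(modelsJson);
        if (!m.find()) {
            throw new IllegalStateException("No models loaded in Foundry Local");
        }
        return m.group(1);
    }

    // Fetch the model list from Foundry Local and return the first model's ID.
    static String detectModel(String baseUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/models"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return firstModelId(response.body());
    }
}
```

If the endpoint returns an empty data array, failing fast with an exception (rather than starting with no model) surfaces the problem at startup, where the Troubleshooting section's advice applies.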

Chat Method:

public String chat(String message) {
    try {
        ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
                .model(model)                    // Which AI model to use
                .addUserMessage(message)         // Your question/prompt
                .maxCompletionTokens(150)        // Limit response length
                .temperature(0.7)                // Control creativity (0.0-1.0)
                .build();
        
        ChatCompletion chatCompletion = openAIClient.chat().completions().create(params);
        
        // Extract the AI's response from the API result
        if (chatCompletion.choices() != null && !chatCompletion.choices().isEmpty()) {
            return chatCompletion.choices().get(0).message().content().orElse("No response found");
        }
        
        return "No response content found";
    } catch (Exception e) {
        throw new RuntimeException("Error calling chat completion: " + e.getMessage(), e);
    }
}

What this does:

  • ChatCompletionCreateParams: Configures the AI request
  • model: Specifies which AI model to use (when set manually, it must match the exact ID of a loaded model, as reported by foundry service ps or the /v1/models endpoint)
    • addUserMessage: Adds your message to the conversation
    • maxCompletionTokens: Limits how long the response can be (saves resources)
    • temperature: Controls randomness (0.0 = deterministic, 1.0 = creative)
  • API Call: Sends the request to Foundry Local
  • Response Handling: Extracts the AI's text response safely
  • Error Handling: Wraps exceptions with helpful error messages
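The "safe extraction" above leans on java.util.Optional: in the OpenAI Java SDK, message().content() returns an Optional<String>, and orElse() supplies the fallback text when the model produced no content. In isolation the pattern looks like this (a stdlib-only illustration, not project code):

```java
import java.util.Optional;

// Stdlib-only illustration of the Optional fallback used in chat():
// orElse() returns the wrapped value when present, the default otherwise.
public class OptionalFallbackDemo {
    static String extract(Optional<String> content) {
        return content.orElse("No response found");
    }

    public static void main(String[] args) {
        System.out.println(extract(Optional.of("Hello from Phi!"))); // Hello from Phi!
        System.out.println(extract(Optional.empty()));               // No response found
    }
}
```

This is why the method never returns null to its caller: every path yields either the model's text or an explicit fallback string.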

4. Project Dependencies (pom.xml)

Key Dependencies:

<!-- Spring Boot - Application framework -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
    <version>${spring-boot.version}</version>
</dependency>

<!-- OpenAI Java SDK - For AI API calls -->
<dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>2.12.0</version>
</dependency>

<!-- Jackson - JSON processing -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.17.0</version>
</dependency>

What these do:

  • spring-boot-starter: Provides core Spring Boot functionality
  • openai-java: Official OpenAI Java SDK for API communication
  • jackson-databind: Handles JSON serialization/deserialization for API calls

How It All Works Together

Here's the complete flow when you run the application:

  1. Startup: Spring Boot starts and reads application.properties
  2. Service Creation: Spring creates FoundryLocalService and injects configuration values
  3. Model Detection: If no model is configured, the service queries Foundry Local's /v1/models endpoint and uses the first available model automatically
  4. Client Setup: @PostConstruct initializes the OpenAI client to connect to Foundry Local
  5. Demo Execution: CommandLineRunner executes after startup
  6. AI Call: The demo calls foundryLocalService.chat() with a test message
  7. API Request: Service builds and sends OpenAI-compatible request to Foundry Local
  8. Response Processing: Service extracts and returns the AI's response
  9. Display: Application prints the response and exits

Setting Up Foundry Local

  1. Install Foundry Local using the instructions in the Prerequisites section.

  2. Start the service (if not already running):

     foundry service start

  3. Check the service status to confirm it is running and note the port:

     foundry service status

  4. Download and run a model (downloads on first run, cached for subsequent runs):

     foundry model run phi-4-mini

     This opens an interactive chat session. You can exit with Ctrl+C. The model stays loaded in the service.

     Tip: Run foundry model list to see all available models. Replace phi-4-mini with any alias from the catalog (e.g., qwen2.5-0.5b for a smaller/faster model).

  5. Verify the model is loaded:

     foundry service ps

  6. Update application.properties if needed:

     • The default base-url (http://localhost:5273/v1) matches the default CLI port. Update only if foundry service status shows a different port.
     • The model is auto-detected at startup, so no configuration is needed.

     foundry.local.base-url=http://localhost:5273/v1
     # Model is auto-detected. Uncomment below to override:
     # foundry.local.model=Phi-4-mini-instruct-cuda-gpu:5

Running the Application

Step 1: Ensure a model is loaded in Foundry Local

foundry service ps

If no models are listed, load one:

foundry model run phi-4-mini

Step 2: Build and Run the Application

In a separate terminal:

cd 04-PracticalSamples/foundrylocal
mvn spring-boot:run

Or build and run as a JAR:

mvn clean package
java -jar target/foundry-local-spring-boot-0.0.1-SNAPSHOT.jar

Expected Output

=== Foundry Local Demo ===
Calling Foundry Local service...
Sending message: Hello! Can you tell me what you are and what model you're running?
Response from Foundry Local:
Hello! I'm Phi, an AI developed by Microsoft. I can assist with a wide variety of 
tasks including answering questions, helping with analysis, creative writing, coding, 
and general conversation. How can I help you today?
=========================

Next Steps

For more examples, see Chapter 04: Practical samples

Troubleshooting

Common Issues

"Connection refused" or "Service unavailable"

  • Check the service: foundry service status
  • Restart if needed: foundry service restart
  • Verify the port in application.properties matches foundry service status output
  • Ensure the URL ends with /v1: http://localhost:5273/v1

"No model found" at startup

  • The application auto-detects the model. Ensure at least one model is loaded: foundry service ps
  • If no models are loaded: foundry model run phi-4-mini
  • If you overrode the model name in application.properties, verify it matches foundry model list

"400 Bad Request" errors

  • Verify the base URL includes /v1: http://localhost:5273/v1
  • Ensure you're using maxCompletionTokens() in your code (not the deprecated maxTokens())

Maven compilation errors

  • Ensure Java 21 or higher: java -version
  • Clean and rebuild: mvn clean compile
  • Check internet connection for dependency downloads

Service connection problems

  • If you see Request to local service failed, run: foundry service restart
  • Check loaded models: foundry service ps
  • View service logs: foundry service diag