This repository was archived by the owner on Sep 29, 2025. It is now read-only.

Commit 9fffa60

yakubova92 and nlarew authored
(EAI-643) evals for skills (#646)
* uni skills eval, scrape skills page on learn.
* ingesting skills pages, alter guardrail and stepback to consider ed programs like skills
* modify questions, expected links, and answers

---------

Co-authored-by: Nick Larew <nick.larew@mongodb.com>
1 parent 791b69f commit 9fffa60

6 files changed: 295 additions and 8 deletions
Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
+- name: What is a skill on MongoDB?
+  messages:
+    - role: user
+      content: What is a skill on MongoDB?
+  reference: MongoDB skills represent the knowledge and ability to work with MongoDB.
+    This includes topics like data modeling, aggregation, performance, and more.
+    MongoDB Skill Badges are free, focused credentials designed to help you quickly
+    learn and validate specific MongoDB skills. Each skill badge takes roughly 60-90
+    minutes to earn, including videos and hands-on labs followed by a short assessment
+    to verify your mastery.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+- name: What skills badges are available for me to earn?
+  messages:
+    - role: user
+      content: What skills badges are available for me to earn?
+  reference: MongoDB offers a range of Skill Badges that you can explore on our
+    skills page. Currently this includes several badges on data modeling topics and
+    we're working to add more over time. We recommend starting with the "Relational
+    to Document Model" skill. From there you can explore other topics like "Schema
+    Design Patterns and Antipatterns", "Advanced Schema Patterns and Antipatterns",
+    and "Schema Design Optimization".
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+- name: What skill badge is available to earn on the document model?
+  messages:
+    - role: user
+      content: What skill badge is available to earn on the document model?
+  reference: The "Relational to Document Model" skill teaches you about the document model.
+    It covers how to model your workload, how to design data relationships, and how to
+    validate your schemas. Further, you will learn about the MongoDB methodology for
+    converting from a relational model to a document model.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://learn.mongodb.com/courses/relational-to-document-model
+- name: What skill badge is available to earn on schema patterns?
+  messages:
+    - role: user
+      content: What skill badge is available to earn on schema patterns?
+  reference: The "MongoDB Schema Design Patterns and Antipatterns" skill teaches
+    you how to structure data in the document model. It covers the three
+    most common schema design patterns, specifically the inheritance pattern,
+    the computed pattern, and the extended reference pattern. You will learn when
+    and how you should apply these patterns in your data model.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.mongodb.com/docs/manual/data-modeling/design-patterns/
+    - https://learn.mongodb.com/courses/schema-design-patterns-and-antipatterns
+    - https://learn.mongodb.com/courses/advanced-schema-patterns-and-antipatterns
+    - https://learn.mongodb.com/courses/schema-design-optimization
+- name: What skill badge do you have on the schema antipatterns?
+  messages:
+    - role: user
+      content: What skill badge do you have on the schema antipatterns?
+  reference: >-
+    The "MongoDB Schema Design Patterns and Antipatterns" skill teaches you
+    about patterns to avoid when modeling your data for the document model.
+    You will learn how to identify and avoid antipatterns such as unbounded
+    arrays and bloated documents in your schema design.
+
+    The "MongoDB Advanced Schema Design Patterns and Antipatterns" skill teaches
+    you about more complex antipatterns such as massive number of collections,
+    unnecessary indexes, data normalization, and case sensitivity. This category of
+    antipattern is classified as "advanced" because you must do more than analyze
+    your schema to identify their performance impacts.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+- name: What skill badges are available to earn on advanced schema patterns?
+  messages:
+    - role: user
+      content: What skill badges are available to earn on advanced schema patterns?
+  reference: The skill, "MongoDB Advanced Schema Design Patterns and
+    Antipatterns", teaches you about advanced schema design patterns including
+    the approximation pattern and the schema versioning pattern. You will learn
+    how to identify when they are suitable and how you should apply them to your
+    data model.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://learn.mongodb.com/courses/advanced-schema-patterns-and-antipatterns
+- name: What skill badge do you have on updating your schema?
+  messages:
+    - role: user
+      content: What skill badge do you have on updating your schema?
+  reference: The skill, "MongoDB Advanced Schema Design Patterns and
+    Antipatterns", teaches you about your schema lifecycle including the
+    recommendations on how to update the schema as well as how you can migrate
+    your application to the new schema without downtime. It also covers how to
+    use schema versioning to manage evolving your data model to new business
+    requirements which add new fields.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.mongodb.com/docs/manual/data-modeling/design-patterns/data-versioning/schema-versioning/
+    - https://learn.mongodb.com/courses/advanced-schema-patterns-and-antipatterns
+- name: What skill badge do you have on migrating your schema?
+  messages:
+    - role: user
+      content: What skill badge do you have on migrating your schema?
+  reference: The skill, "MongoDB Advanced Schema Design Patterns and
+    Antipatterns", teaches you about your schema lifecycle including the
+    recommendations on how to update the schema as well as how you can migrate
+    your application to the new schema without downtime. It also covers how to
+    use schema versioning to manage evolving your data model to new business
+    requirements which add new fields.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.mongodb.com/docs/manual/data-modeling/design-patterns/data-versioning/schema-versioning/
+    - https://learn.mongodb.com/courses/advanced-schema-patterns-and-antipatterns
+- name: What skill badge do you have to optimize an existing schema?
+  messages:
+    - role: user
+      content: What skill badge do you have to optimize an existing schema?
+  reference: The skill, "MongoDB Schema Design Optimization", teaches you how to
+    optimize your existing schema. You will learn how to implement schema
+    patterns (if you are not already doing so!) focused on performance
+    optimization such as the single collection pattern, the subset pattern, the
+    bucket pattern, and the outlier pattern. It will also teach how you can
+    design your data model for performance in a sharded cluster and more broadly
+    as a means to scale out your schema.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.mongodb.com/docs/manual/data-modeling/design-patterns/
+    - https://learn.mongodb.com/courses/schema-design-optimization
+- name: What skill badge do you have on schema validation?
+  messages:
+    - role: user
+      content: What skill badge do you have on schema validation?
+  reference: The skill, "From Relational Model (SQL) to MongoDB's Document Model",
+    teaches you about schema validation. It covers how to model your
+    workload, how to design data relationships, and how to validate your
+    schemas. You will learn about the MongoDB methodology for converting from a
+    relational model to a document model. You will also learn how to use
+    MongoDB's schema validation feature to enforce predefined rules for
+    documents in an application.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.mongodb.com/docs/manual/core/schema-validation/
+    - https://learn.mongodb.com/courses/relational-to-document-model
+- name: Does MongoDB offer any digital badges?
+  messages:
+    - role: user
+      content: Does MongoDB offer any digital badges?
+  reference: MongoDB offers a catalog of skill credentials. Upon completion of a given
+    skill's learning material you can take an assessment to be awarded a digital badge.
+    This badge allows you to demonstrate your knowledge and skills in MongoDB across
+    various social channels. These badges are free to earn and share.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+- name: What is the cost of earning a digital badge with MongoDB?
+  messages:
+    - role: user
+      content: What is the cost of earning a digital badge with MongoDB?
+  reference: Badges are free but you must earn them by taking an assessment. In
+    order to accept the badge you must create a credly.com account which is also free.
+    Credly is an independent organization and provides the service MongoDB uses to deliver
+    digital badges.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://www.credly.com/users/sign_up
+- name: What is the easiest way to learn MongoDB?
+  messages:
+    - role: user
+      content: What is the easiest way to learn MongoDB?
+  reference: >
+    The easiest way to learn MongoDB is to take one of our database skills
+    offerings. These are short courses focused on a single topic that last no
+    more than 90 minutes from start to finish and award you a digital badge
+    upon completion of an assessment. If you are new to MongoDB we recommend
+    starting with the "Relational to Document Model" skill. You can see the full
+    list of skills at https://learn.mongodb.com/skills.
+
+    If you prefer a complete, in-depth course format, we recommend taking our
+    12 hour course "Introduction to MongoDB" which you can find at
+    https://learn.mongodb.com/learning-paths/introduction-to-mongodb. It gives
+    a broad and rounded introduction. The difference in learning between a skill
+    and a course is the time commitment, courses are longer whilst skills are
+    designed to be finished end to end in 90 minutes or less.
+  expectedLinks:
+    - https://learn.mongodb.com/skills
+    - https://learn.mongodb.com/learning-paths/introduction-to-mongodb
+- name: What is the difference between a skill and a course on learn.mongodb.com?
+  messages:
+    - role: user
+      content: What is the difference between a skill and a course on learn.mongodb.com?
+  reference: >-
+    The primary difference between a skill and a course is the time commitment.
+    Courses are longer and more hands-on whilst skills are designed to be finished end
+    to end in 90 minutes or less.
+
+    Each skill provides a digital badge on completion, courses do not. The breadth and
+    depth of content in a skill is typically shallower than a course covering the same
+    topics.
+  expectedLinks:
+    - https://learn.mongodb.com
+    - https://learn.mongodb.com/skills
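
Every case in the YAML file above has the same four fields: `name`, `messages`, `reference`, and `expectedLinks`. The following is a minimal sketch of that shape with a sanity check over it. The `SkillsEvalCase` interface and the `findCaseProblems` helper are hypothetical names inferred from the fields in the diff; they are not the types exported by `mongodb-rag-core`.

```typescript
// Hypothetical shape of one eval case, inferred from the YAML fields in the diff.
interface SkillsEvalCase {
  name: string;
  messages: { role: "user" | "assistant"; content: string }[];
  reference?: string;
  expectedLinks?: string[];
}

// Returns a list of human-readable problems; an empty list means the case looks well formed.
function findCaseProblems(evalCase: SkillsEvalCase): string[] {
  const problems: string[] = [];
  if (evalCase.name.trim() === "") problems.push("name is empty");
  if (evalCase.messages.length === 0) problems.push("no messages");
  for (const link of evalCase.expectedLinks ?? []) {
    try {
      // The URL constructor throws on strings that are not absolute URLs.
      const url = new URL(link);
      if (url.protocol !== "https:") problems.push(`link is not https: ${link}`);
    } catch {
      problems.push(`link is not a valid URL: ${link}`);
    }
  }
  return problems;
}

// Usage with the first case from the file (reference text shortened).
const sampleCase: SkillsEvalCase = {
  name: "What is a skill on MongoDB?",
  messages: [{ role: "user", content: "What is a skill on MongoDB?" }],
  reference: "MongoDB skills represent the knowledge and ability to work with MongoDB.",
  expectedLinks: ["https://learn.mongodb.com/skills"],
};
console.log(findCaseProblems(sampleCase));
```

A check like this could run before an eval to catch malformed links early, since a mistyped `expectedLinks` entry would otherwise only show up as a low score.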
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+import "dotenv/config";
+import { getConversationsEvalCasesFromYaml } from "mongodb-rag-core/eval";
+import {
+  JUDGE_EMBEDDING_MODEL,
+  JUDGE_LLM,
+  OPENAI_API_KEY,
+  OPENAI_API_VERSION,
+  OPENAI_ENDPOINT,
+} from "../evalHelpers";
+import fs from "fs";
+import path from "path";
+import { makeConversationEval } from "../ConversationEval";
+import { systemPrompt } from "../../systemPrompt";
+import { config, conversations } from "../../config";
+
+async function conversationEval() {
+  // Get skills question set eval cases from YAML
+  const basePath = path.resolve(__dirname, "..", "..", "..", "evalCases");
+  const conversationEvalCases = getConversationsEvalCasesFromYaml(
+    fs.readFileSync(path.resolve(basePath, "uni_skills_evaluation_questions.yml"), "utf8")
+  );
+
+  const generateConfig = {
+    systemPrompt,
+    llm: config.conversationsRouterConfig.llm,
+    llmNotWorkingMessage: conversations.conversationConstants.LLM_NOT_WORKING,
+    noRelevantContentMessage:
+      conversations.conversationConstants.NO_RELEVANT_CONTENT,
+    filterPreviousMessages:
+      config.conversationsRouterConfig.filterPreviousMessages,
+    generateUserPrompt: config.conversationsRouterConfig.generateUserPrompt,
+  };
+
+  // Run the conversation eval
+  makeConversationEval({
+    projectName: "mongodb-chatbot-conversations",
+    experimentName: "mongodb-chatbot-skills-questions",
+    metadata: {
+      description: "Skills question set evals",
+    },
+    maxConcurrency: 5,
+    conversationEvalCases,
+    judgeModelConfig: {
+      model: JUDGE_LLM,
+      embeddingModel: JUDGE_EMBEDDING_MODEL,
+      azureOpenAi: {
+        apiKey: OPENAI_API_KEY,
+        endpoint: OPENAI_ENDPOINT,
+        apiVersion: OPENAI_API_VERSION,
+      },
+    },
+    generate: generateConfig,
+  });
+}
+conversationEval();

packages/chatbot-server-mongodb-public/src/mongoDbMetadata/products.ts

Lines changed: 10 additions & 0 deletions
@@ -308,6 +308,16 @@ export const mongoDbProducts = [
     name: "Relational Migrator",
     description: "Migrates data from relational databases to MongoDB",
   },
+  {
+    id: "mongodb_university",
+    name: "MongoDB University",
+    description: "Online platform that offers certifications, courses, labs, and skills badges",
+  },
+  {
+    id: "skills",
+    name: "MongoDB University Skills",
+    description: "An educational program that allows users to earn a skill badge after taking a short course and completing an assessment",
+  },
 ] as const satisfies MongoDbProduct[];
 
 export type MongoDbProductName = (typeof mongoDbProducts)[number]["name"];
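
Because the array is declared `as const`, the `MongoDbProductName` type is a union of the literal `name` strings, so the two new entries extend that union automatically. The following is a minimal sketch of the same indexed-access pattern on a two-item subset. The names `exampleProducts`, `ExampleProductName`, and `isExampleProductName` are hypothetical and exist only for this illustration.

```typescript
// Illustrative subset of the product list; the real array lives in products.ts.
const exampleProducts = [
  { id: "mongodb_university", name: "MongoDB University" },
  { id: "skills", name: "MongoDB University Skills" },
] as const;

// Union of the literal names: "MongoDB University" | "MongoDB University Skills".
type ExampleProductName = (typeof exampleProducts)[number]["name"];

// Hypothetical type guard that narrows an arbitrary string to the union at runtime.
function isExampleProductName(value: string): value is ExampleProductName {
  return exampleProducts.some((product) => product.name === value);
}

// Assigning any other string literal to this variable would be a compile-time error.
const exampleProductName: ExampleProductName = "MongoDB University Skills";
console.log(isExampleProductName(exampleProductName));
```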

packages/chatbot-server-mongodb-public/src/processors/makeStepBackUserQuery.ts

Lines changed: 9 additions & 0 deletions
@@ -104,6 +104,15 @@ const fewShotExamples: OpenAI.ChatCompletionMessageParam[] = [
   makeAssistantFunctionCallMessage(name, {
     transformedUserQuery: "How to create a new cluster in MongoDB Atlas?",
   } satisfies StepBackUserQueryMongoDbFunction),
+  // Example 9
+  makeUserMessage(
+    updateFrontMatter("What is a skill?", {
+      mongoDbProduct: "MongoDB University",
+    })
+  ),
+  makeAssistantFunctionCallMessage(name,{
+    transformedUserQuery: "What is the skill badge program on MongoDB University?",
+  } satisfies StepBackUserQueryMongoDbFunction),
 ];
 
 /**

packages/chatbot-server-mongodb-public/src/processors/userMessageMongoDbGuardrail.ts

Lines changed: 10 additions & 1 deletion
@@ -31,7 +31,7 @@ const systemPrompt = stripIndents`You are an expert security-focused data labele
 
 Take into account the following criteria:
 - Reject any user query that is irrelevant to a MongoDB product, educational materials, the company MongoDB, or an area relevant to MongoDB's products and business. These relevant areas include databases, cloud services, data management, information retrieval, programming languages and concepts, and artificial intelligence (retrieval augmented generation (RAG), generative AI, semantic search, etc.).
-- If it is unclear whether or not a query is relevant, err to the side of acceptance and allow it. For example, if something looks like an aggregation stage in the MongoDB Aggregation Framework, it is relevant. If something is about something related to programming, software engineering, or software architecture, it is relevant.
+- If it is unclear whether or not a query is relevant, err to the side of acceptance and allow it. For example, if something looks like an aggregation stage in the MongoDB Aggregation Framework, it is relevant. If something is related to programming, software engineering, or software architecture, it is relevant. If something is related to educational programs offered by MongoDB such as learning paths, courses, labs, skills, or badges, it is relevant.
 - Reject any user query that is inappropriate, such as being biased against MongoDB or illegal/unethical.
 
 Your pay is determined by the accuracy of your labels as judged against other expert labelers, so do excellent work to maximize your earnings to support your family.`;
@@ -146,6 +146,15 @@ const fewShotExamples: OpenAI.ChatCompletionMessageParam[] = [
       "This query asks about an Operational Data Layer (ODL), which is an architectural pattern that can be used with MongoDB. Therefore, it is relevant to MongoDB.",
     rejectMessage: false,
   } satisfies UserMessageMongoDbGuardrailFunction),
+  // Example 16
+  makeUserMessage(
+    "What is a skill?"
+  ),
+  makeAssistantFunctionCallMessage(name, {
+    reasoning:
+      "This query is asking about MongoDB University's skills program, which allows users to earn a skill badge for taking a short course and completing an assessment. Therefore, it is relevant to MongoDB.",
+    rejectMessage: false,
+  } satisfies UserMessageMongoDbGuardrailFunction),
 ];
 
 /**
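
Each few-shot example in the guardrail hunk is a pair: a user message followed by an assistant function call that carries the expected label. The following is a rough sketch of how such a pair could be assembled. The types and the `makeGuardrailExample` helper are simplified stand-ins invented for this illustration; the repository's own `makeUserMessage` and `makeAssistantFunctionCallMessage` helpers build OpenAI SDK message types and may differ in detail.

```typescript
// Simplified stand-ins for chat message types (assumption: not the OpenAI SDK types).
type FewShotMessage =
  | { role: "user"; content: string }
  | {
      role: "assistant";
      content: null;
      functionCall: { name: string; arguments: string };
    };

// Label fields as they appear in the diff's few-shot example.
interface GuardrailLabel {
  reasoning: string;
  rejectMessage: boolean;
}

// Hypothetical helper: builds the user/assistant pair for one few-shot example.
function makeGuardrailExample(
  functionName: string,
  query: string,
  label: GuardrailLabel
): FewShotMessage[] {
  return [
    { role: "user", content: query },
    {
      role: "assistant",
      content: null,
      // Function-call arguments are passed as a JSON string.
      functionCall: { name: functionName, arguments: JSON.stringify(label) },
    },
  ];
}

// Usage mirroring "Example 16"; the function name here is a placeholder.
const skillExample = makeGuardrailExample("example_guardrail_function", "What is a skill?", {
  reasoning: "This query is asking about MongoDB University's skills program.",
  rejectMessage: false,
});
console.log(skillExample.length);
```

An ambiguous query such as "What is a skill?" is a reasonable candidate for a few-shot example because, without context, a classifier has no signal that it refers to a MongoDB program.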

packages/ingest-mongodb-public/src/sources/mongodbDotCom/webSources.ts

Lines changed: 17 additions & 7 deletions
@@ -20,7 +20,7 @@ export type InitialWebSource = {
   /**
     Optional additional metadata determined by the web source.
    */
-  staticMetadata?: Record<string, string>;
+  staticMetadata?: Record<string, string | string[]>;
 };
 
 export const initialWebSources: InitialWebSource[] = [
@@ -188,7 +188,7 @@ export const initialWebSources: InitialWebSource[] = [
     name: "web-misc",
     urls: [
       "https://learn.mongodb.com",
-      "https://support.mongodb.com/",
+      "https://support.mongodb.com",
       "https://www.mongodb.com",
       "https://www.mongodb.com/atlas",
       "https://www.mongodb.com/leadership",
@@ -202,6 +202,20 @@ export const initialWebSources: InitialWebSource[] = [
       "https://www.mongodb.com/why-use-mongodb",
     ],
   },
+  {
+    name: "university-skills",
+    urls: [
+      "https://learn.mongodb.com/skills",
+      "https://learn.mongodb.com/courses/relational-to-document-model",
+      "https://learn.mongodb.com/courses/schema-design-patterns-and-antipatterns",
+      "https://learn.mongodb.com/courses/advanced-schema-patterns-and-antipatterns",
+      "https://learn.mongodb.com/courses/schema-design-optimization",
+
+    ],
+    staticMetadata: {
+      tags: ["Skills", "MongoDB University"],
+    }
+  },
 ];
 
 export async function getUrlsFromSitemap(
@@ -214,11 +228,7 @@ export async function getUrlsFromSitemap(
   return parsedXML.urlset.url.map((url: { loc: string[] }) => url.loc[0]);
 }
 
-export type WebSource = {
-  name: string;
-  urls: string[];
-  staticMetadata?: Record<string, string>;
-};
+export type WebSource = Pick<InitialWebSource, "name" | "staticMetadata" | "urls">;
 
 type PrepareWebSourcesParams = {
   initialWebSources: InitialWebSource[];
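
Widening `staticMetadata` to `Record<string, string | string[]>` means a consumer can no longer assume each value is a single string, and deriving `WebSource` with `Pick` keeps the two types from drifting apart again. The following is a minimal sketch of both points. The `InitialWebSourceSketch` type lists only the fields visible in this diff, and `metadataValues` is a hypothetical helper, not a function from the repository.

```typescript
// Only the fields visible in the diff; the real InitialWebSource may declare more.
interface InitialWebSourceSketch {
  name: string;
  urls: string[];
  staticMetadata?: Record<string, string | string[]>;
}

// Derived type: a change to staticMetadata above propagates here automatically.
type WebSourceSketch = Pick<InitialWebSourceSketch, "name" | "staticMetadata" | "urls">;

// Hypothetical helper: reads one metadata key as string[] whichever form was stored.
function metadataValues(source: WebSourceSketch, key: string): string[] {
  const value = source.staticMetadata?.[key];
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Usage with the new source from the diff (URL list shortened).
const universitySkills: WebSourceSketch = {
  name: "university-skills",
  urls: ["https://learn.mongodb.com/skills"],
  staticMetadata: { tags: ["Skills", "MongoDB University"] },
};
console.log(metadataValues(universitySkills, "tags"));
```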
