In this episode, Ted sits down with Jan Van Hoecke, VP of Product Management and AI Services at iManage, to discuss the evolving role of AI in legal technology and knowledge management. From his early work in AI engineering to the current challenges facing law firms, Jan shares his expertise in building practical AI solutions across industries. With thoughtful insights on why legal teams lag in AI adoption and how better data can drive smarter automation, this conversation offers actionable ideas for legal professionals navigating the next wave of innovation.
In this episode, Jan Van Hoecke shares insights on how to:
Understand the differences in AI adoption between legal and engineering fields
Approach the alignment problem and limitations of today’s LLMs
Improve data hygiene and leverage hybrid search for better knowledge retrieval
Rethink the partnership model to support true R&D in law firms
Use AI to enhance work satisfaction and client outcomes in legal services
Key takeaways:
AI adoption in law is slower due to cultural, structural, and risk-related barriers
High-quality, well-structured data is essential for effective AI implementation
Hybrid search—combining semantic and traditional methods—offers powerful legal search capabilities
Law firms must invest in R&D and experimentation to avoid falling behind
AI can improve legal professionals’ job fulfillment and work-life balance when used thoughtfully
About the guest, Jan Van Hoecke
Jan Van Hoecke is a seasoned AI engineer and product leader with a deep passion for Natural Language Processing and innovation. As a founder of a pioneering legal tech company and now VP of Product Management and AI Services at iManage, he brings a unique blend of hands-on technical expertise and strategic vision to transforming the legal industry through AI. Jan is driven by a belief in technology’s power to create meaningful, lasting change.
As an engineer, I look at what is at our disposal in technology and what we can do with it. But also I’ve got this kind of science hat. And the science hat is more about where we are moving towards in the longer future.
1
00:00:02,348 --> 00:00:06,485
Jan, how are you this afternoon, or I guess this evening, your time?
2
00:00:06,485 --> 00:00:07,401
It's evening.
3
00:00:07,401 --> 00:00:07,924
Good.
4
00:00:07,924 --> 00:00:08,556
Thanks, Ted.
5
00:00:08,556 --> 00:00:09,710
Thanks for having me.
6
00:00:09,710 --> 00:00:11,630
Yeah, I'm excited about the conversation.
7
00:00:11,630 --> 00:00:17,730
We've been trying to get this scheduled for a while, so I'm glad we're actually making it
happen.
8
00:00:18,830 --> 00:00:23,450
Why don't we get you introduced for the folks that don't know you?
9
00:00:23,850 --> 00:00:26,670
You've been around for quite some time.
10
00:00:26,670 --> 00:00:27,850
You're now at iManage.
11
00:00:27,850 --> 00:00:29,570
You were formerly at RAVN.
12
00:00:29,650 --> 00:00:32,230
You were even all the way back in the Autonomy days.
13
00:00:32,230 --> 00:00:36,870
But why don't you tell everybody about your background and what you're up to today?
14
00:00:36,919 --> 00:00:45,355
I guess if I go way back then, I studied as an engineer em and specifically did AI at uni
and that's quite a while ago.
15
00:00:45,355 --> 00:00:47,426
That was my second one.
16
00:00:47,426 --> 00:00:50,368
I did chip design, hardware design first.
17
00:00:50,368 --> 00:00:59,985
Then I moved into AI research with a steel company and that's where I decided that I
should probably leave steel behind.
18
00:00:59,985 --> 00:01:03,036
I still regret that; it was an amazing company to work for.
19
00:01:03,259 --> 00:01:05,098
I joined Autonomy.
20
00:01:05,118 --> 00:01:06,433
That's exactly it.
21
00:01:06,433 --> 00:01:07,263
You're right there.
22
00:01:07,263 --> 00:01:09,435
So worked in enterprise search for quite a while.
23
00:01:09,435 --> 00:01:14,639
And then we decided with a couple of us to leave Autonomy behind and start RAVN.
24
00:01:14,639 --> 00:01:18,491
So I was one of the co-founders of RAVN and was CTO there for seven years.
25
00:01:18,671 --> 00:01:22,103
And we got acquired by iManage in 2017.
26
00:01:22,694 --> 00:01:30,589
And I stayed in engineering positions, and now VP of product management, so still product
positions, for all my tenure at iManage.
27
00:01:30,589 --> 00:01:33,541
It's also been seven years by the way.
28
00:01:33,541 --> 00:01:36,263
And yeah, main mission has been to.
29
00:01:36,951 --> 00:01:43,826
build out an AI team, bring AI to the cloud and get it embedded into the products of the
iManage portfolio.
30
00:01:43,826 --> 00:01:45,177
That's really been my role.
31
00:01:45,177 --> 00:01:47,078
em Yeah.
32
00:01:47,078 --> 00:01:54,904
And I guess just to maybe like summarize it, I guess I've been wearing an engineering hat
for most of my career.
33
00:01:54,904 --> 00:02:00,088
So as an engineer, I look at what is at our disposal in technology and what can we do with
it.
34
00:02:00,088 --> 00:02:03,440
But also I've got this kind of science hat, right?
35
00:02:03,440 --> 00:02:05,161
And the science hat is more about.
36
00:02:05,357 --> 00:02:07,210
Where are we moving towards in the longer future?
37
00:02:07,210 --> 00:02:08,461
Where is this trending?
38
00:02:08,461 --> 00:02:11,507
And the timeframes are slightly different.
39
00:02:11,507 --> 00:02:15,261
I think it's months and a couple of years for engineering.
40
00:02:15,261 --> 00:02:19,937
It's longer, many years, when I look at things as a scientist.
41
00:02:20,406 --> 00:02:20,887
Interesting.
42
00:02:20,887 --> 00:02:23,380
Well, you were way ahead of the curve on AI.
43
00:02:23,380 --> 00:02:33,131
What was it that drove you in that direction, you know, so early on when AI was still kind
of somewhat niche?
44
00:02:33,195 --> 00:02:34,195
Yeah, it was.
45
00:02:34,195 --> 00:02:39,087
I mean, it's definitely pre all the connectionist models, as we call it, right?
46
00:02:39,087 --> 00:02:40,737
The connectionist model is the neural network.
47
00:02:40,737 --> 00:02:44,438
So when I got into it, it was before that time.
48
00:02:44,538 --> 00:02:47,359
was just this, just this.
49
00:02:47,999 --> 00:02:56,351
fact that on the one hand, intelligence and consciousness is something that really
interests me a lot.
50
00:02:56,351 --> 00:02:59,952
in the, you know, the fact that it just emerges into the world.
51
00:02:59,952 --> 00:03:03,223
And then secondly, that there's this field of AI, which is
52
00:03:03,299 --> 00:03:04,679
pursuing this, right?
53
00:03:04,679 --> 00:03:12,719
It's on the one hand, trying to investigate and explain what our intelligence is all about
and our reasoning processes are all about.
54
00:03:12,719 --> 00:03:22,579
On the other hand, it's also bringing these technologies then to the field of our
practical applications, embedding it into products and making things happen with it.
55
00:03:22,639 --> 00:03:31,489
And this fact that you could make machines behave in a semi or seemingly intelligent way
is something that I always liked.
56
00:03:31,489 --> 00:03:34,971
That's why I picked up the study and I've always stuck with it.
57
00:03:35,360 --> 00:03:38,530
And when did you actually get involved into the field?
58
00:03:38,530 --> 00:03:39,733
Like what year?
59
00:03:42,614 --> 00:03:45,095
2001 I think is when I graduated.
60
00:03:45,315 --> 00:03:47,739
it's been a while.
61
00:03:47,886 --> 00:03:55,283
Yeah, I mean, that was so Watson on Jeopardy was in the 90s, right?
62
00:03:55,283 --> 00:04:01,805
Yeah, and we had the chess computer before, they were just deep search models, right, as
you call it.
63
00:04:01,865 --> 00:04:10,867
And then we had the, my specialty was support vector machines, which kind of went out of
fashion as neural networks stepped in.
64
00:04:11,147 --> 00:04:19,830
And I worked on trying to do, for instance, corrosion detection, the type of corrosion on
steel plates, because it was a steel company, right?
65
00:04:19,830 --> 00:04:24,171
And so we kind of had a guy who
66
00:04:24,171 --> 00:04:29,653
He evaluated steel plates by looking at them and said, like, it's 10% corroded by this type
of corrosion.
67
00:04:29,653 --> 00:04:37,256
And then we built training sets and an SVM to train on them and to completely make his job
redundant.
68
00:04:37,256 --> 00:04:41,418
He liked it because he, I mean, he liked being made redundant for that full task.
69
00:04:41,418 --> 00:04:44,819
That was not the joy of his day, let's say.
70
00:04:44,942 --> 00:04:56,622
Yeah, well, yeah, I mean, so I paid attention during those early years when I started my
technology journey very early, fifth grade.
71
00:04:57,002 --> 00:04:59,722
So this would have been 1982.
72
00:04:59,882 --> 00:05:08,262
I got a Texas Instruments 99/4A personal computer, an Extended BASIC cartridge, and a book
about
73
00:05:08,312 --> 00:05:12,764
two and a half inches thick that just had all the syntax of the different commands.
74
00:05:12,764 --> 00:05:19,106
And I mean, I was 10 years old and I was totally geeking out on this and building little
programs.
75
00:05:19,106 --> 00:05:24,289
I remember I built an asteroid program where basically the asteroids didn't move.
76
00:05:24,289 --> 00:05:29,856
I wasn't that sophisticated, but you could navigate a little spaceship across the static
asteroid field.
77
00:05:29,856 --> 00:05:37,974
But you know, I was 10 years old and then I got out of it in high school because chicks don't
want to talk to
78
00:05:38,286 --> 00:05:54,546
guys, so I stepped away and then found it again after college, when, you know, so
many things had changed so much. But you know, AI really kind of hit my radar, it was the
79
00:05:54,546 --> 00:06:07,430
AlphaGo, you know, that was like the moment, like wow. But, you know, since then I've been, you
know, ChatGPT
80
00:06:07,490 --> 00:06:09,822
and oh all these new capabilities.
81
00:06:09,822 --> 00:06:12,515
I'm spending a lot of time there.
82
00:06:12,515 --> 00:06:18,740
And I'm finding a lot of amazing efficiencies.
83
00:06:18,781 --> 00:06:26,087
You saw the agenda that I put together for us that was an output of we had a conversation
on a planning call.
84
00:06:26,087 --> 00:06:33,642
I took the transcript, put it into a custom Claude project with examples in its training
materials and custom instructions.
85
00:06:33,642 --> 00:06:39,186
and that used to take me, I used to have to go back and listen to the recording again and
take notes.
86
00:06:39,307 --> 00:06:45,452
So it would be 30 minutes on the call, then another 30 minutes at least to listen and
get all the details.
87
00:06:45,452 --> 00:06:48,334
And now it takes me about three minutes.
88
00:06:48,669 --> 00:06:58,457
So these, mean, coming to this topic of the efficiencies, I actually went out and looked a
little bit because like one of the things I've been fascinated about is how does like a
89
00:06:58,457 --> 00:07:03,341
knowledge industry like legal compare to other knowledge industries, for instance,
engineering, right?
90
00:07:03,341 --> 00:07:12,539
So how do they, why is it that engineers treat themselves to better tools sometimes
than the legal workers to make their life easier?
91
00:07:12,539 --> 00:07:17,653
So I started looking for data to back this up specifically then in the AI land.
92
00:07:17,795 --> 00:07:21,806
So I found this study that was done by GitHub and it's on their own product, right?
93
00:07:21,806 --> 00:07:27,968
On Copilot, GitHub Copilot, which is probably not the thing you just take as a scientific
research paper, right?
94
00:07:27,968 --> 00:07:29,688
Because it's on their own stuff.
95
00:07:29,688 --> 00:07:42,462
But they did say that when they rolled it out to an organization that they have like 95 %
adoption on the same day by every user, practically every user starts using it.
96
00:07:42,582 --> 00:07:46,383
And then they get to what does it actually help them with?
97
00:07:46,903 --> 00:07:52,205
They claimed that it was a 55% time savings on coding tasks.
98
00:07:52,986 --> 00:07:57,988
But I don't know if that's actually backed by real data or it was the perception of the
people.
99
00:07:57,988 --> 00:08:01,869
And one of the metrics I track is published by METR.
100
00:08:01,869 --> 00:08:14,755
I don't know if you know METR, but METR just published a report a couple of days ago on
how AI helps open source developers in there, how it speeds them up and how much they
101
00:08:14,755 --> 00:08:15,459
think in...
102
00:08:15,459 --> 00:08:18,279
advance it will speed them up and then how much it actually did.
103
00:08:18,279 --> 00:08:30,499
What they found is that they think, they hope for a 20%, 30% speedup, but they
suffer from a 12% slowdown when using AI, which kind of really baffled me.
104
00:08:30,499 --> 00:08:34,579
That's very contradictory to what the Copilot people were saying.
105
00:08:34,979 --> 00:08:44,621
Maybe the most interesting one was that, and that one I believe, is that of the IT
developers who use an AI assistant in coding, 90%
106
00:08:44,621 --> 00:08:47,833
felt more fulfilled in their job.
107
00:08:47,954 --> 00:09:00,414
And that's, know, if anything else, that is something that I would be interested in,
especially because TR did some survey and they found that the number one thing that legal
108
00:09:00,414 --> 00:09:02,866
workers want to improve is their work-life balance.
109
00:09:02,866 --> 00:09:07,990
So if fulfillment is something that can bring them and make them happier, then at least
it's that.
110
00:09:08,951 --> 00:09:13,515
But yeah, I think it's been slower in the uptake in legal, but it's also now happening.
111
00:09:13,515 --> 00:09:14,381
Maybe...
112
00:09:14,381 --> 00:09:26,073
three, five years ago, definitely in the RAVN days, we could claim, like, there's
always the skepticism and lack of trust, and I think that's, with the, you know, ChatGPTs and
113
00:09:26,073 --> 00:09:30,057
the LLMs that has changed or is changing and has already changed.
114
00:09:30,722 --> 00:09:38,464
Yeah, you know, uh Ethan Mollick talks a lot about kind of the jagged edge of AI in terms of
capabilities.
115
00:09:38,464 --> 00:09:44,306
And, you know, I noticed that, so my coding skills are largely out of date other than SQL.
116
00:09:44,306 --> 00:09:49,347
um I was on the SQL team at Microsoft many years ago and SQL hasn't changed much.
117
00:09:49,347 --> 00:09:59,530
um So um I'm able to still do some things in there and I do from time to time, you know,
analyze data and whatnot.
118
00:09:59,530 --> 00:10:10,275
And I have noticed a very um high degree of variation, even from really good
models like Claude, for coding.
119
00:10:10,275 --> 00:10:22,620
Like just yesterday, I, uh, downloaded a little freeware app called AutoHotkey
and, you know, trying to be more efficient m with common snippets.
120
00:10:22,620 --> 00:10:28,122
would, and I had, I had Claude write me a script and it took me like,
121
00:10:28,686 --> 00:10:32,126
It took me like five times to iterate through it for it to get it right.
122
00:10:32,126 --> 00:10:40,086
You know, the first time it did it on the previous version of AutoHotkey, you know, you
didn't, and now the syntax is a little different.
123
00:10:40,106 --> 00:10:49,486
Then it, you know, I was basically having it do a Ctrl+V, uh, paste into an
app and it would only paste part of the string.
124
00:10:49,486 --> 00:10:50,986
And then I had to ask it why.
125
00:10:50,986 --> 00:10:57,998
And then it, you know, I basically had to put a little timer delay in there to get it to
paste the full string before it
126
00:10:57,998 --> 00:11:00,078
terminated the thread, I guess.
127
00:11:00,498 --> 00:11:07,478
then on other scenarios like SQL, if I have, let's say, a little Access database, I'll
pull some data down.
128
00:11:07,478 --> 00:11:22,418
If I don't want to mess with SQL, I'll export the database schema into PDFs, upload it
into an LLM, and ask it to write a query that would require me to go search for syntax,
129
00:11:22,418 --> 00:11:27,896
like a correlated subquery or something that I'm not doing.
130
00:11:27,896 --> 00:11:30,733
frequently and it usually nails it.
131
00:11:31,086 --> 00:11:35,680
I think that jagged edge concept is real.
132
00:11:35,757 --> 00:11:43,600
mean, some of these shortcomings, let's say, are then picked up, picked on and joked
about.
133
00:11:43,600 --> 00:11:46,882
Like we had this, I don't know if you remember this, strawberry.
134
00:11:46,882 --> 00:11:53,015
Yeah, so why can't they tell me how many Rs are there in the word strawberry?
135
00:11:53,015 --> 00:12:02,099
But then if you actually dig deeper, what happens under the hood is the model never sees
the word strawberry.
136
00:12:02,679 --> 00:12:09,133
You know, what happens is there's a tokenizer and the tokenizer splits the words into
individual subparts.
137
00:12:09,133 --> 00:12:17,459
then though each of those might be straw and berry or bear and re or it might be just one
token, you you don't really know.
138
00:12:17,459 --> 00:12:23,402
But the key thing is that it then converts that into like a numerical vector.
139
00:12:23,402 --> 00:12:25,494
And that's really what the model reasons with.
140
00:12:25,494 --> 00:12:27,957
So for all it.
141
00:12:27,957 --> 00:12:31,488
knows it could be strawberry written in French, which is fraise.
142
00:12:31,488 --> 00:12:34,529
mean, it would be the same vector at sea.
143
00:12:34,529 --> 00:12:39,380
because it never has access to that something we see, which is the word, it couldn't
answer that question.
144
00:12:39,380 --> 00:12:46,332
It could just like probably just look in its memory of things it's seen that is close and
then just try to make an educated guess.
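(To make that tokenizer step concrete, here is a minimal sketch in Python, assuming the open-source tiktoken library and one of its published encodings; the exact splits and token IDs vary by model vocabulary.)

import tiktoken

# A published byte-pair encoding; real chat models use their own vocabularies.
enc = tiktoken.get_encoding("cl100k_base")

token_ids = enc.encode("strawberry")
pieces = [enc.decode([tid]) for tid in token_ids]

# The model only ever sees the integer IDs (turned into vectors), never the letters,
# so "how many r's" has to be guessed from memory rather than read off the word.
print(token_ids)   # a short list of integers
print(pieces)      # sub-word pieces, e.g. something like "str" + "awberry" (vocabulary-dependent)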
145
00:12:46,332 --> 00:12:48,833
So there's explanations.
146
00:12:48,833 --> 00:12:55,154
And then once you know the explanation, you can work towards solving them as well, of
course.
147
00:12:56,495 --> 00:12:57,235
I guess
148
00:12:57,235 --> 00:13:05,633
One I don't want to distract too much, but one that really fascinates me is the alignment
problem.
149
00:13:05,633 --> 00:13:12,929
And alignment kind of comes down to these LLMs are really very rough gems.
150
00:13:13,530 --> 00:13:16,472
They're language prediction machines.
151
00:13:16,472 --> 00:13:20,015
They've seen a lot of text, like all the text is actually on the internet.
152
00:13:20,015 --> 00:13:24,359
And then what we give them is some input and...
153
00:13:24,705 --> 00:13:27,997
the model needs to complete whatever we've given them.
154
00:13:28,378 --> 00:13:38,046
But, and the way that these big vendors make them do something that's actually valuable to
them is by a second training step, this reinforcement learning.
155
00:13:38,046 --> 00:13:42,870
The one that actually AlphaGo, you know, that's where AlphaGo became famous for the...
156
00:13:42,870 --> 00:13:45,832
So there's this two-phase training process.
157
00:13:45,832 --> 00:13:54,179
On the one hand, these LLMs consume all the text and they have to predict the next word,
just like, you know, the cell phone next word prediction thing works.
158
00:13:54,179 --> 00:14:05,234
And then secondly, to teach them about values or the goals that they should achieve, they
get this reinforcement, the learning.
159
00:14:05,234 --> 00:14:07,985
the reinforcement is kind of like a carrot and a whip.
160
00:14:07,985 --> 00:14:11,336
Like when they get the right answer, then they get a carrot.
161
00:14:11,336 --> 00:14:14,458
And if they don't get the right answer, they get whipped by some human being.
162
00:14:14,458 --> 00:14:16,288
That's essentially what happens, right?
163
00:14:16,789 --> 00:14:21,710
And that's how they get shaped into making sure that they do something useful for us.
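(A rough toy illustration of that carrot-and-whip loop, in Python; this is a simplified sketch of reinforcement from a human reward signal, not the actual RLHF pipeline any vendor runs, and the two canned completions are invented for the example.)

import math
import random

# Two canned completions stand in for the model's possible behaviours.
completions = ["grounded, helpful answer", "confident, made-up answer"]
scores = {c: 0.0 for c in completions}

def sample_completion():
    # Softmax over the current scores: higher-scored behaviour gets sampled more often.
    weights = [math.exp(scores[c]) for c in completions]
    r = random.uniform(0, sum(weights))
    for completion, weight in zip(completions, weights):
        r -= weight
        if r <= 0:
            return completion
    return completions[-1]

for _ in range(200):
    choice = sample_completion()
    reward = 1.0 if choice == completions[0] else -1.0   # the human carrot or whip
    scores[choice] += 0.1 * reward                       # nudge the chosen behaviour up or down

print(scores)  # the grounded answer ends up with the higher score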
164
00:14:22,689 --> 00:14:25,080
And Anthropic has looked into that quite a bit.
165
00:14:25,080 --> 00:14:34,722
And what is really fascinating is that it gets, you know, the bigger the model becomes and
the, guess you could say the smarter it becomes, the harder it is to get them aligned with
166
00:14:34,722 --> 00:14:36,143
what we want them to do.
167
00:14:36,143 --> 00:14:39,233
They really try to uh cheat us, right?
168
00:14:39,233 --> 00:14:41,924
That's, they see exactly.
169
00:14:41,924 --> 00:14:44,645
They try, they talk very nice to us.
170
00:14:44,645 --> 00:14:46,766
They, they think like we're the best.
171
00:14:46,766 --> 00:14:52,557
That's, know, and they, but more importantly, I guess more scientifically is if you give
them a coding test.
172
00:14:52,557 --> 00:14:54,238
They try to take shortcuts.
173
00:14:54,238 --> 00:14:56,888
They don't necessarily write a program that actually works.
174
00:14:56,888 --> 00:15:03,530
They try to write a program that satisfies the test conditions, which is not necessarily
the same thing.
175
00:15:03,830 --> 00:15:06,931
And that's where it gets really fascinating.
176
00:15:06,931 --> 00:15:11,173
You can see this human behavior slipping into them.
177
00:15:11,173 --> 00:15:19,415
And it will be a challenge to keep on, at least with this technology, to keep on making
them useful for us.
178
00:15:19,756 --> 00:15:20,566
Yeah.
179
00:15:20,566 --> 00:15:33,490
Well, you mentioned coding and like how the last time you and I spoke when we were getting
prepared for this episode, we talked about how um the kind of the contrasting approach
180
00:15:33,490 --> 00:15:44,033
between how legal professionals leverage or view AI and software engineers with tools like
GitHub Copilot.
181
00:15:44,033 --> 00:15:47,924
And there's kind of different mindsets, different approaches.
182
00:15:47,924 --> 00:15:49,154
What is your?
183
00:15:49,420 --> 00:15:50,952
What is your take on that?
184
00:15:51,843 --> 00:16:00,669
I there's definitely like a difference in adoption, the difference of adoption that has
been around for a while.
185
00:16:00,829 --> 00:16:04,171
mean, the IT and software world can't be compared to the legal world.
186
00:16:04,171 --> 00:16:14,698
If you look at, I'll just bring up an example that I've mentioned in the past, just to
illustrate how different these industries look at things as the open source movement,
187
00:16:14,698 --> 00:16:14,969
right?
188
00:16:14,969 --> 00:16:17,921
So the open source movement was a big movement.
189
00:16:17,921 --> 00:16:20,142
I guess it goes back to this sixties or seventies.
190
00:16:20,142 --> 00:16:22,013
I don't know exactly when it started.
191
00:16:22,115 --> 00:16:33,195
where some universities and even individuals and companies decided that they would just
throw all their intellectual property in the open and share it with everyone with the
192
00:16:33,195 --> 00:16:43,415
belief that that would actually fast track the entire industry and it would accelerate
them rather than, you know, give all their most valuable assets away.
193
00:16:43,415 --> 00:16:49,635
That is something that's completely unthinkable as a business concept, I think, in the
legal industry.
194
00:16:49,635 --> 00:16:51,917
While maybe it could also fast...
195
00:16:51,917 --> 00:16:54,579
track or uh accelerate or fuel the industry.
196
00:16:54,579 --> 00:16:56,510
We don't really know how that would end.
197
00:16:56,510 --> 00:17:04,175
there was definitely, Microsoft was one of the big fighters against the open source
movement because they thought it was going to ruin everything.
198
00:17:04,415 --> 00:17:06,096
It has changed, of course.
199
00:17:06,217 --> 00:17:08,628
I just wanted to take that up as an example.
200
00:17:08,628 --> 00:17:16,533
So there's definitely a change in attitude and maybe it's risk aversion and probably with
201
00:17:16,631 --> 00:17:29,810
with reason, like the output quality, the risks around data privacy and being exposed as
an individual, like that lawyer that used it in 2023, that New York lawyer that wrote the
202
00:17:29,810 --> 00:17:30,701
brief.
203
00:17:30,701 --> 00:17:37,725
that, I mean, no developer really, I think has that same risk that they would get exposed
in this way.
204
00:17:37,825 --> 00:17:40,817
Software gets written and gets double checked by machines.
205
00:17:40,817 --> 00:17:43,089
And of course it has to function before it goes out.
206
00:17:43,089 --> 00:17:46,721
So there's more of a personality around there that matters.
207
00:17:46,947 --> 00:17:49,008
There's a different business model, of course, right?
208
00:17:49,008 --> 00:18:01,431
The billing, then I'm talking about law firms, the billing by the hour model that
definitely doesn't really encourage the super efficiency, which is very different for
209
00:18:01,911 --> 00:18:02,641
corporate legal.
210
00:18:02,641 --> 00:18:12,114
We, by the way, I think even at iManage, we see that with our customers, that there's a
difference in attitude and uptake between corporate legal and law firms.
211
00:18:12,694 --> 00:18:15,425
Maybe it's as a personality.
212
00:18:15,567 --> 00:18:17,887
Maybe there's a knowledge gap.
213
00:18:17,887 --> 00:18:31,516
I think we've touched on the fact that there's definitely like an immediate return on
investment mentality versus engineering firms where there's more of an R&D, true R&D.
214
00:18:31,516 --> 00:18:38,981
Like let's set the budget aside and let some innovation brew in that budget.
215
00:18:38,981 --> 00:18:43,875
mean, that's just engineering firms have to innovate that way.
216
00:18:43,875 --> 00:18:45,197
to be able to be future-proof.
217
00:18:45,197 --> 00:18:54,197
And I think that's a mentality not really baked into the legal industry, just because
there was never a need for it.
218
00:18:54,540 --> 00:18:54,890
Right.
219
00:18:54,890 --> 00:18:57,452
Yeah, I've written about this quite a bit.
220
00:18:57,452 --> 00:19:00,373
And that's due to a number of factors.
221
00:19:00,373 --> 00:19:09,798
I would say the most uh highly contributing factor in the legal industry to this, how
foreign R&D is, it's the partnership model.
222
00:19:09,958 --> 00:19:16,462
So the partnership model is very much a partnership model that operates on a cash basis.
223
00:19:16,522 --> 00:19:18,823
R&D expenses are accrued.
224
00:19:18,823 --> 00:19:24,192
um Even if your uh tax treatment accelerates that
225
00:19:24,192 --> 00:19:33,048
for tax purposes in general on your internal books, you amortize R&D costs over its
useful life.
226
00:19:33,048 --> 00:19:43,916
um law firm partnerships are very much um about maximizing profits at the end of the year.
227
00:19:43,916 --> 00:19:52,562
And I think that's one of the big hurdles that law firms face when trying to
228
00:19:52,686 --> 00:20:00,446
map their strategy with respect to AI, there's going to be some experimentation and some
R&D that's required.
229
00:20:01,066 --> 00:20:09,986
And focusing too much on immediate ROI, I think is going to limit risk taking and
ultimately hold firms back.
230
00:20:09,986 --> 00:20:12,546
I actually see it every day.
231
00:20:13,806 --> 00:20:19,026
I've done business with about 110 AMLaw firms when I stopped counting.
232
00:20:19,626 --> 00:20:22,046
so I've seen a good cross-sectional view.
233
00:20:22,046 --> 00:20:32,946
I have, I talk to firms on a frequent basis where I hear things like, we're
going to wait and see because we really can't articulate an ROI today because it's going
234
00:20:32,946 --> 00:20:34,658
to reduce the billable hour.
235
00:20:34,658 --> 00:20:45,047
I would say those firms are more and more starting to be in the minority and most firms
now, especially the big ones get that wait and see is a bad idea.
236
00:20:45,047 --> 00:20:48,343
But yeah, I think the partnership model is a big, a big factor in this.
237
00:20:48,343 --> 00:20:50,884
Well, that's why I was going to ask you, do you think there's change?
238
00:20:50,884 --> 00:20:59,606
Like, because we see A&O with Harvey, like that's definitely some kind of jump into like a
big unknown.
239
00:20:59,806 --> 00:21:07,428
And even in iManage, like we see the, for instance, the uptake of Ask iManage, which is
our LLM-based product.
240
00:21:08,249 --> 00:21:12,870
It's the fastest uptake that we've seen for any of our products before.
241
00:21:12,870 --> 00:21:15,731
And that is firms who want to just...
242
00:21:15,757 --> 00:21:19,670
don't miss out and want to experiment because they're not just buying us.
243
00:21:19,951 --> 00:21:23,434
They're trying different things and seeing what sticks.
244
00:21:23,434 --> 00:21:32,263
And there's quite some in-house initiatives and teams being spun up, at least probably in
the larger law firms that's happening.
245
00:21:32,263 --> 00:21:35,285
uh I would, by the way, definitely encourage that.
246
00:21:35,285 --> 00:21:36,667
So I'm on board with you.
247
00:21:36,667 --> 00:21:41,191
Like, encourage the in-house experiment, set some budget aside for it.
248
00:21:41,869 --> 00:21:46,941
Try different vendors, try software yourself, see what works and don't just write it off.
249
00:21:46,941 --> 00:21:48,612
Like figure out the constraints.
250
00:21:48,612 --> 00:21:49,763
That's really it, right?
251
00:21:49,763 --> 00:21:56,806
These products have certain constraints, figure out what the constraints are, but figure
out within those constraints what you can do with it.
252
00:21:56,946 --> 00:21:58,651
That would be my suggestion.
253
00:21:58,651 --> 00:22:04,935
And it's hard to put in a spreadsheet, the R in the ROI, the return is learning.
254
00:22:05,736 --> 00:22:09,358
And again, that's hard to quantify and put a figure on.
255
00:22:09,358 --> 00:22:18,584
But at the end of the day, if you're not thinking that way, you're going to limit risk
taking.
256
00:22:19,150 --> 00:22:25,229
you're not going to push forward at the pace at which you're going to need to to keep up.
257
00:22:25,229 --> 00:22:27,110
um
258
00:22:27,210 --> 00:22:28,050
in my opinion.
259
00:22:28,050 --> 00:22:37,393
um What about, so, you you in the world of document management, you know, I see a lot of
document management systems.
260
00:22:37,393 --> 00:22:41,464
We don't implement; we're partners with iManage for integration purposes.
261
00:22:41,464 --> 00:22:48,736
So in InfoDash, we surface uh iManage content in intranet and extranet scenarios.
262
00:22:48,736 --> 00:22:56,078
um But as a part of that doing that work for the last almost 20 years, I've seen a lot of
law firm DMSs.
263
00:22:56,526 --> 00:22:58,667
And there's very poor data hygiene.
264
00:22:58,667 --> 00:23:14,617
Um, there's been a lot of kind of mergers and acquisitions where you'll get one mess of a
law firm's DMS that gets, um, merged into another and they have different, um, different
265
00:23:14,617 --> 00:23:16,477
types of shortcomings.
266
00:23:17,919 --> 00:23:24,102
and it really seems like an overwhelming task for
267
00:23:24,238 --> 00:23:34,958
these law firms to actually straighten that up and get it to a place where it
makes sense to point AI at an entire DM corpus.
268
00:23:35,098 --> 00:23:36,958
Um, is that your take as well?
269
00:23:36,958 --> 00:23:41,158
I mean, it sounds, it feels like you really need a curated data set.
270
00:23:41,507 --> 00:23:44,927
Well, mean, you definitely take a step back.
271
00:23:44,927 --> 00:23:49,587
You definitely need to do something about the information that you have, right?
272
00:23:49,587 --> 00:24:00,187
I mean, legal being an information business, it should be, I guess, obvious that managing and
finding that information should be high on the priority list of what you invest in.
273
00:24:00,447 --> 00:24:03,587
That's the simple statement to make.
274
00:24:04,027 --> 00:24:11,317
We definitely very often hear, like, can't we throw all those documents that you have in
the DMS and put it in ChatGPT and...
275
00:24:11,349 --> 00:24:14,000
and just get amazing results out of it.
276
00:24:14,241 --> 00:24:23,187
that's, I mean, we, hope they're finding out that that doesn't work and everybody kind of,
if you know the technology, that that's not really how it will work.
277
00:24:23,347 --> 00:24:33,594
So getting a good data set is definitely the, I mean, the strategy that as an engineer,
I'll put on my engineering hat is what you need to pursue right now.
278
00:24:33,594 --> 00:24:33,884
Right.
279
00:24:33,884 --> 00:24:40,699
So the data that goes in, the quality of the data that goes in, is also the
quality of the data that comes out.
280
00:24:40,699 --> 00:24:41,331
Now.
281
00:24:41,331 --> 00:24:43,992
Search technology has evolved quite a bit.
282
00:24:43,992 --> 00:24:46,913
there's very interesting things that it can do.
283
00:24:46,913 --> 00:24:51,242
I mean, AI has brought us the semantic representation.
284
00:24:51,242 --> 00:24:52,534
I mentioned that before, right?
285
00:24:52,534 --> 00:25:00,826
So the words don't get represented as strings anymore, but they get represented by a
mathematical vector that represents the meaning.
286
00:25:00,826 --> 00:25:06,218
We call it the, these embeddings, vector embeddings.
287
00:25:06,218 --> 00:25:11,515
And simply speaking, it makes sure that, like,
288
00:25:11,563 --> 00:25:18,646
force majeure or act of God, very different strings if you look at them, but they are very
close to each other.
289
00:25:18,646 --> 00:25:21,988
Are they exactly the same when you represent them in meaning space?
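(A minimal sketch of that meaning-space idea, assuming the sentence-transformers library and one of its small public models; the point is only that two very different strings can land close together as vectors.)

from sentence_transformers import SentenceTransformer, util

# A small public embedding model; a production system would use its own.
model = SentenceTransformer("all-MiniLM-L6-v2")

phrases = ["force majeure", "act of God", "payment is due within 30 days"]
vectors = model.encode(phrases)

# Cosine similarity in "meaning space": the two legal phrases score far closer to each
# other than either does to the unrelated payment clause, despite sharing no words.
print(util.cos_sim(vectors[0], vectors[1]))
print(util.cos_sim(vectors[0], vectors[2]))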
290
00:25:21,988 --> 00:25:30,311
So we've got this that has helped, but we really need that combined with the traditional
filters so we can have metadata filters.
291
00:25:30,311 --> 00:25:38,804
If you say the document should be, I'm looking for something that's written in the last two
years, no meaning vector is going to help you there.
292
00:25:38,804 --> 00:25:40,155
So you need this.
293
00:25:40,155 --> 00:25:43,457
good metadata on it as well.
294
00:25:43,577 --> 00:25:45,608
And we kind of call that hybrid search, right?
295
00:25:45,608 --> 00:25:55,443
So this hybrid search is the joining of the semantic index, which is very interesting,
together with the traditional search index.
296
00:25:55,443 --> 00:25:58,645
And Microsoft has benchmarked that that's the best approach.
297
00:25:58,645 --> 00:26:09,731
If you compare each one individually, pure semantic or pure traditional, you get lower
scores on finding the right information at the right time.
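(A bare-bones sketch of the hybrid idea in Python: a keyword score, a vector-similarity score, and a metadata filter combined into one ranking. The field names, the equal weighting, and the "last two years" filter are simplified assumptions for illustration, not iManage's actual implementation.)

from datetime import datetime, timedelta
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def hybrid_search(query_vector, query_terms, documents, top_k=10):
    # Each document is assumed to carry "text", "vector" and "modified" fields.
    cutoff = datetime.now() - timedelta(days=2 * 365)
    scored = []
    for doc in documents:
        if doc["modified"] < cutoff:            # traditional metadata filter: last two years only
            continue
        keyword = sum(doc["text"].lower().count(term.lower()) for term in query_terms)
        semantic = cosine(query_vector, doc["vector"])
        scored.append((0.5 * keyword + 0.5 * semantic, doc))   # naive blend of both signals
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]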
298
00:26:09,847 --> 00:26:13,480
The information you put into it is still the information that will come out of it, right?
299
00:26:13,480 --> 00:26:23,799
So if you put in a document that you would never want anyone to use, it will come out and
if you don't have the right warnings on it, that might, I mean, that might be very
300
00:26:23,799 --> 00:26:24,620
problematic.
301
00:26:24,620 --> 00:26:33,757
But by the way, just digging a little bit deeper on that search, because I kind of like
search, they also found, and I want to give that to you, is they also found that apart
302
00:26:33,757 --> 00:26:38,207
from hybrid search, semantic re-ranking also has
303
00:26:38,207 --> 00:26:39,927
another 10 % uptake.
304
00:26:39,927 --> 00:26:48,410
Semantic re-ranking means that whatever comes back from the search engine, you pass it
over again based on the question that the user has and then change the order.
305
00:26:48,410 --> 00:26:55,171
So you take a look at the top 50 results, for instance, and you say, these results are all
good, but this one is actually the one that should be on number one.
306
00:26:55,171 --> 00:27:00,052
I found that really interesting that it has another 10 % uptake in their tests.
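(A small sketch of that re-ranking step, assuming the sentence-transformers CrossEncoder class and a public re-ranking model; the results being reordered would come from a first-pass search like the hybrid sketch above.)

from sentence_transformers import CrossEncoder

# A public cross-encoder re-ranker; the vendor's own model and weights would differ.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(question, results, top_k=10):
    # Score every (question, passage) pair jointly, then reorder the first-pass hits.
    pairs = [(question, doc["text"]) for doc in results]
    scores = reranker.predict(pairs)
    reordered = sorted(zip(scores, results), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in reordered[:top_k]]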
307
00:27:00,893 --> 00:27:05,234
And there's technology beyond that, like graph RAG, and...
308
00:27:05,731 --> 00:27:10,002
probably don't have time to go into there, but all of these technologies fit really well
with legal.
309
00:27:10,002 --> 00:27:14,213
So law, legal industry should look into it.
310
00:27:14,213 --> 00:27:22,236
How I take it, how you can get to those best information is you should let the system do
as much of the heavy work as possible.
311
00:27:22,456 --> 00:27:30,348
we found out that enriching the data for instance figuring out what doc type you're
dealing with, what contract type or other legal contract type is something machines can do
312
00:27:30,348 --> 00:27:31,619
reliably well.
313
00:27:31,619 --> 00:27:33,645
Pulling out key.
314
00:27:33,645 --> 00:27:42,382
information points like what's the contract date, who are the parties, some other things
that you typically would want to reference in your searches, try to pull it out and put it
315
00:27:42,382 --> 00:27:47,446
as metadata because that's, mean, and again, that's something that can be done by machines
and can be done reliably well.
316
00:27:47,446 --> 00:27:51,969
It's not LLMs doing it, that would be too expensive, but it's doable.
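(A simplified sketch of that enrichment step; production pipelines use trained classifiers and extractors, but even a crude rule-based pass shows the shape of pulling a contract date and parties out as searchable metadata. The patterns and field names here are illustrative assumptions.)

import re

def enrich(document_text):
    metadata = {}

    # Illustrative date pattern: "12 January 2023" style dates.
    date = re.search(
        r"\b\d{1,2} (January|February|March|April|May|June|July|"
        r"August|September|October|November|December) \d{4}\b",
        document_text,
    )
    if date:
        metadata["contract_date"] = date.group(0)

    # Illustrative party pattern: "between X and Y" in the opening recital.
    parties = re.search(r"between (.+?) and (.+?)[,.]", document_text)
    if parties:
        metadata["parties"] = [parties.group(1).strip(), parties.group(2).strip()]

    return metadata

print(enrich("This agreement is made on 12 January 2023 between Acme Corp and Beta LLC, ..."))
# -> {'contract_date': '12 January 2023', 'parties': ['Acme Corp', 'Beta LLC']}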
317
00:27:52,390 --> 00:28:03,447
But then you kind of still need to find the gems in there and those gems, I mean, so far
that's still a human process to figure out like this is one of the, you know.
318
00:28:03,447 --> 00:28:07,290
we've closed this deal, these are the gems that we want to keep.
319
00:28:07,290 --> 00:28:15,416
Let's put a label on it and once you've tagged it somehow, the system can know that it
should prefer those to come back.
320
00:28:15,917 --> 00:28:24,483
And that's our strategy, get the data as rich as possible, ensure that you use the AI as
much as possible to enrich it.
321
00:28:24,864 --> 00:28:28,366
We also believe in bringing the AI to the data, right?
322
00:28:28,366 --> 00:28:33,443
So don't pull all the data out, so you lose all the security context, but bring the AI to
it.
323
00:28:33,443 --> 00:28:40,143
It doesn't have to be iManage AI, it can be other vendors, and then bring all of the smarter
search technology on top of it.
324
00:28:40,443 --> 00:28:43,863
But I've said that all of that, that's my engineering hat, right?
325
00:28:43,863 --> 00:28:56,683
If you think about the science hat, then I do have to say that every time that we've
anything, I mean, comparing to symbolic AI, I don't know if you know what symbolic AI is.
326
00:28:56,683 --> 00:29:00,243
We had this symbolic AI where you try to build rule sets for everything, right?
327
00:29:00,243 --> 00:29:02,963
So we had language translation.
328
00:29:03,043 --> 00:29:06,295
10 years, 15 years ago that was done by software with rules.
329
00:29:06,295 --> 00:29:14,869
Essentially they wrote rules of how English should be translated into French and somebody
managed and maintained and curated all those rules.
330
00:29:15,190 --> 00:29:26,316
But then at some point, what we call the connectionist AI, trained a model by looking at
French and English texts and figuring out what those rules were internalized into a model.
331
00:29:26,316 --> 00:29:32,449
And you can't really look at how it does it, but it does it when we do the benchmark, we
see that it does it.
332
00:29:32,471 --> 00:29:35,953
vastly better than the traditional rule-based one.
333
00:29:35,953 --> 00:29:45,660
And that's the same for grammar correction systems, code generation, I guess now we had
code generations before, or transcription or transcription as well.
334
00:29:46,080 --> 00:29:51,624
These sound bites were then transcribed into words.
335
00:29:51,624 --> 00:29:59,199
So for all of these technologies, we've seen that the symbolic version has been superseded by
the connectionist one.
336
00:29:59,199 --> 00:30:01,781
So I'm just saying,
337
00:30:01,781 --> 00:30:04,953
Right now as an engineer and a product manager, that's what we have to do.
338
00:30:04,953 --> 00:30:08,515
We have to really curate those sets, but five years from now, it could be very different.
339
00:30:08,515 --> 00:30:09,896
And I don't know what it will look like.
340
00:30:09,896 --> 00:30:19,022
Maybe it's the machine doing the curation for us, or it just doesn't need it anymore
because it sees, as long as it has all the information to make the determination, it sees
341
00:30:19,022 --> 00:30:20,002
all of it.
342
00:30:20,183 --> 00:30:23,705
But there is a chance, of course, that the connectionist model overtakes it.
343
00:30:23,705 --> 00:30:26,120
em Just...
344
00:30:26,120 --> 00:30:28,306
That's kind of the Elon Musk.
345
00:30:28,306 --> 00:30:32,247
um Yeah, that's his theory as well.
346
00:30:32,247 --> 00:30:37,131
I think I'm not as optimistic about timelines as Elon might be.
347
00:30:37,131 --> 00:30:39,833
That's just the feasibility of it.
348
00:30:39,833 --> 00:30:43,315
em You don't really know.
349
00:30:43,956 --> 00:30:57,415
A very interesting benchmark, I'm not a benchmark person, but a very interesting thing to
track, again from METR, is the duration of tasks, as done by a human, that the AI can do with high
350
00:30:57,415 --> 00:30:58,446
reliability.
351
00:30:58,446 --> 00:31:00,097
That's maybe a bit.
352
00:31:00,289 --> 00:31:11,662
difficult sentence, essentially means like, so the, how well an AI can do a task which
takes a certain amount of minutes for a human to do, right?
353
00:31:11,662 --> 00:31:14,383
And they track how well, how that's evolving.
354
00:31:14,383 --> 00:31:22,386
So let's say, m me doing a Google query and looking at the result, that's a task that
takes me about 30 seconds, right?
355
00:31:22,386 --> 00:31:27,387
em Replying to an email em or writing a...
356
00:31:27,651 --> 00:31:30,451
one class of code takes me about four minutes.
357
00:31:30,451 --> 00:31:34,551
So you could say this, these are increasingly more complex tasks.
358
00:31:34,791 --> 00:31:38,411
Some tasks take a human an hour to do or take four hours to do.
359
00:31:38,411 --> 00:31:47,191
And what they do is they let the machine, they benchmark, how well an LLM or some AI
performs at these tasks.
360
00:31:47,191 --> 00:31:57,871
So right now they got to the point that six to 10 minutes tasks can be done with high
success rate by LLMs.
361
00:31:57,891 --> 00:32:03,331
And the length of that task duration has been doubling every seven
months.
362
00:32:03,331 --> 00:32:12,011
So every seven months, so within seven months, you could expect it to go to tasks that
would take us 15 minutes, but then seven months later, it's 30 minutes, right?
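(Working that doubling rate through, assuming a starting horizon of roughly 10-minute tasks and the seven-month doubling he cites; a back-of-the-envelope projection, not a prediction.)

# horizon(t) = horizon_now * 2 ** (months / 7), i.e. doubling every seven months
horizon_now = 10  # minutes of human task time handled reliably today (his "six to 10 minutes")

for months in (0, 7, 14, 21, 28):
    minutes = horizon_now * 2 ** (months / 7)
    print(f"in {months:2d} months: tasks of roughly {minutes:.0f} human-minutes")

# 10 -> 20 -> 40 -> 80 -> 160 minutes, so over 2.5 hours of human work within roughly two and a half years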
363
00:32:12,011 --> 00:32:22,431
And at some point you kind of have quite an individual or let's say autonomous AI doing
work.
364
00:32:22,811 --> 00:32:26,699
So, I mean, so it's again, it's benchmark and it's a
365
00:32:26,699 --> 00:32:32,832
It's a forecast and you know, you can't trust forecast, but I think it's a very
interesting one that we've been tracking.
366
00:32:32,832 --> 00:32:49,970
that one, I think will matter as to, you know, whether these curation problems can be
solved or if complex legal tasks will be fixable, will be doable, I mean, by AIs.
367
00:32:49,970 --> 00:32:53,481
So, I think that's one to keep an eye on.
368
00:32:54,178 --> 00:32:55,199
Super interesting.
369
00:32:55,199 --> 00:33:01,863
um What are your thoughts on how far we can get?
370
00:33:01,863 --> 00:33:17,792
And I don't know down what path, uh you know, whether that's AGI or ASI or whatever's next
after that with the current kind of LLM transformer architecture.
371
00:33:17,792 --> 00:33:21,574
It seems to me like they're, it's not going to
372
00:33:21,612 --> 00:33:23,284
This isn't the end state.
373
00:33:23,284 --> 00:33:27,767
This is an intermediate state to whatever's next.
374
00:33:27,767 --> 00:33:30,230
And I have no idea what that might look like.
375
00:33:30,230 --> 00:33:38,797
But there are just some shortcomings with this particular approach that we've managed to
find really good workarounds for.
376
00:33:38,797 --> 00:33:49,606
Like uh maybe a year ago, um I would sit here and tell you that LLMs can't reason, that
they don't understand
377
00:33:50,243 --> 00:33:53,205
the question or the prompt that you put in there.
378
00:33:53,205 --> 00:33:59,541
And there's been a lot of workarounds, you know, with inference time compute and, um, that
have worked around that.
379
00:33:59,541 --> 00:34:06,559
Well, I don't know that I could sit here and say that today because the output looks so
convincing, but
380
00:34:06,559 --> 00:34:07,550
I had the same thing.
381
00:34:07,550 --> 00:34:11,081
had the, and they can reach out to tools, right?
382
00:34:11,081 --> 00:34:12,611
That's also something we've given them.
383
00:34:12,611 --> 00:34:15,492
There's an ability that they can call out to other tools.
384
00:34:15,492 --> 00:34:18,483
For instance, they were never very good at doing maths.
385
00:34:18,483 --> 00:34:20,884
Simple calculations couldn't be done.
386
00:34:20,884 --> 00:34:26,345
Probably also related to this entire representation problem, em but they couldn't do it.
387
00:34:26,345 --> 00:34:31,747
And then now they could just reach out to a calculator and do the calculations and pull
the results back and use it.
388
00:34:31,747 --> 00:34:31,967
Right?
389
00:34:31,967 --> 00:34:32,707
So.
390
00:34:32,899 --> 00:34:34,439
All right, but that wasn't your problem.
391
00:34:34,439 --> 00:34:39,979
The problem is can LLMs fundamentally do semantic reasoning tasks, right?
392
00:34:39,979 --> 00:34:42,699
And take that very far.
393
00:34:42,959 --> 00:34:48,799
I think that is one of the best questions to ask and also one of the hardest ones to
answer.
394
00:34:49,059 --> 00:34:54,359
My mind is like, so I've always said, no, it can't be done.
395
00:34:54,619 --> 00:34:57,859
it's a, LLMs are curve fitting.
396
00:34:58,119 --> 00:35:02,719
So they see a lot of, they've seen a lot of data on the internet and they fit the curve.
397
00:35:03,303 --> 00:35:03,923
on that.
398
00:35:03,923 --> 00:35:08,164
So the curve fitting is only as good as the data it has seen.
399
00:35:08,164 --> 00:35:12,346
And the only thing they can come up with is something that is somewhere on that curve.
400
00:35:12,346 --> 00:35:14,506
So they can't think out of that box.
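(A tiny illustration of the curve-fitting point, assuming NumPy: a model fit only on data from a limited range does fine inside that range and drifts badly once asked about points outside it.)

import numpy as np

# "Training data": noisy samples of sin(x) seen only on the interval [0, 3].
rng = np.random.default_rng(0)
x_train = np.linspace(0, 3, 40)
y_train = np.sin(x_train) + rng.normal(0, 0.05, x_train.shape)

# Pure curve fitting on what was seen.
coeffs = np.polyfit(x_train, y_train, deg=5)

inside = np.polyval(coeffs, 1.5)    # inside the seen range: close to sin(1.5), about 1.0
outside = np.polyval(coeffs, 6.0)   # outside the seen range: usually nowhere near sin(6.0)
print(inside, outside)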
401
00:35:14,586 --> 00:35:29,620
And we as humans, I think, prove that we don't have to see, just to go a bit further on
that, they might still amaze us every day because they come up with an answer that amazes
402
00:35:29,620 --> 00:35:32,531
us that we wouldn't know, for instance.
403
00:35:33,325 --> 00:35:39,488
But if you would have seen all the data that they've seen, maybe that answer doesn't
actually amaze you, right?
404
00:35:39,488 --> 00:35:42,499
So their answers are always interpolations.
405
00:35:42,499 --> 00:35:45,000
They can't come up with something novel.
406
00:35:45,120 --> 00:35:48,361
And we seem to be, as humans, be able to come up with something novel.
407
00:35:48,361 --> 00:35:50,362
That's at least what it seems like.
408
00:35:50,582 --> 00:35:56,025
But we also have a much bigger brain capacity than the LLMs have.
409
00:35:56,025 --> 00:36:02,091
It's hard to estimate, but it's definitely more than a factor of 1,000, maybe a factor of
10,000.
410
00:36:02,243 --> 00:36:07,064
uh more complex than our brain is and the LLM brain is.
411
00:36:07,384 --> 00:36:17,507
But it seems that sticking to my point is I don't think LLMs can fundamentally do
reasoning indefinitely.
412
00:36:17,507 --> 00:36:22,989
They can pretend to do it with all the data they've seen, but they can't actually think
outside of the box.
413
00:36:22,989 --> 00:36:26,510
Not until this curve fitting problem is solved.
414
00:36:26,510 --> 00:36:27,560
But that can be solved.
415
00:36:27,560 --> 00:36:31,671
There's other algorithms like genetic algorithms.
416
00:36:31,969 --> 00:36:33,780
which do not have that constraint.
417
00:36:33,780 --> 00:36:47,248
So maybe a combination or change of the architectures, bringing in some genetic evolution
into it, genetic algorithms into it, or some other technology might bring us to that next
418
00:36:47,248 --> 00:36:49,849
level that we need.
419
00:36:50,190 --> 00:37:00,175
I definitely think that this, and that's not a very original thing to say, but the worst
LLMs or the worst AIs we've seen are the ones that we see today.
420
00:37:00,461 --> 00:37:01,412
They do evolve.
421
00:37:01,412 --> 00:37:04,334
I do think we'll need a scientific incremental step.
422
00:37:04,334 --> 00:37:06,106
It's not just going to be the same technology.
423
00:37:06,106 --> 00:37:09,818
We'll need a new step to really get to AGI.
424
00:37:09,859 --> 00:37:13,002
But that doesn't mean it's not useful in the state it is.
425
00:37:13,002 --> 00:37:24,431
So we've all really realized, I think, that you can give it a document and it can summarize
it really well or answer questions on that document really well or use snippets of it and
426
00:37:24,431 --> 00:37:27,534
then compose something new or compose an email.
427
00:37:27,534 --> 00:37:29,101
So there's definitely...
428
00:37:29,101 --> 00:37:34,869
within the constraints it has, there's definitely value we can get out of it.
429
00:37:35,362 --> 00:37:49,028
Yeah, I think to take it to that next level, to start curing diseases and uh really making
breakthroughs in science, the ability to come up with novel concepts, um like you said, it
430
00:37:49,468 --> 00:37:58,732
can reassemble the existing Lego blocks it's been trained on to create new structures of
information.
431
00:37:58,732 --> 00:38:03,064
But in terms of something outside of that universe of
432
00:38:03,064 --> 00:38:05,196
things that seem like a mathematical proof.
433
00:38:05,196 --> 00:38:12,051
Finding a novel approach, I was a math major undergrad, so, and I struggled in advanced
calculus with proofs.
434
00:38:12,051 --> 00:38:14,263
That was a very humbling experience for me.
435
00:38:14,263 --> 00:38:32,118
um I'm great at, you know, um differential equations, matrix theory, all the stuff where
you're solving equations, but proofs require such an abstract lens and thinking that
436
00:38:32,334 --> 00:38:36,483
um And I don't think LLMs are ever gonna get there.
437
00:38:36,483 --> 00:38:40,043
That limitation is embedded in their design, correct?
438
00:38:40,043 --> 00:38:40,943
That's correct.
439
00:38:40,943 --> 00:38:42,564
Yeah, I think that's correct.
440
00:38:43,124 --> 00:38:48,225
By the way, I start to seem like the benchmark guy, but another interesting one.
441
00:38:48,225 --> 00:39:00,368
So there's all these bar exam benchmarks, but they test for a lot of knowledge that the LLM
might have seen on the internet.
442
00:39:00,549 --> 00:39:08,251
There's François Chollet, uh he's the person who started the em Keras library, I think.
443
00:39:08,251 --> 00:39:09,971
So one of the AI
444
00:39:10,019 --> 00:39:15,431
programming, the libraries you would use if you would be writing a low level AI code in
Python.
445
00:39:15,431 --> 00:39:18,943
He was the original author of that product.
446
00:39:18,943 --> 00:39:26,726
And he's also very verbose about and explicit about the fact that he doesn't believe that
the LLMs will take us there.
447
00:39:26,726 --> 00:39:30,168
And to prove it, he's created this Arc challenge.
448
00:39:30,168 --> 00:39:36,710
And the Arc contest is a benchmark, but with challenges that the LLM definitely hasn't
seen.
449
00:39:36,710 --> 00:39:38,371
So they come up with...
450
00:39:38,403 --> 00:39:44,763
challenges, which are very abstract visual challenges that as a human are super simple to
solve.
451
00:39:44,763 --> 00:39:49,483
Like we'll nail 99 % of them without an issue.
452
00:39:49,483 --> 00:39:57,423
But the LLM score maybe 2, 3 % on the current ARC2 benchmark.
453
00:39:57,463 --> 00:40:06,183
So he thinks that that's a true benchmark for novel thinking, maybe for the path towards
general intelligence.
454
00:40:07,127 --> 00:40:17,422
And that's also an interesting too, and definitely an interesting person to listen to and
to interview if you would ever be able to get him on the podcast.
455
00:40:17,612 --> 00:40:20,864
Yeah, no, that sounds super interesting.
456
00:40:20,864 --> 00:40:26,368
Yeah, I love getting uh detailed with this.
457
00:40:26,368 --> 00:40:27,819
I've had guests on the show.
458
00:40:27,819 --> 00:40:33,963
In fact, I had a uh colleague, or former colleague of yours, Jack Shepherd, um on the
podcast.
459
00:40:33,963 --> 00:40:37,776
And we were talking about the legal reasoning question.
460
00:40:37,776 --> 00:40:44,020
Or I'm sorry, not legal reasoning, just LLM's ability to reason and whether or not they
truly comprehend.
461
00:40:44,141 --> 00:40:46,582
And his comment was, it's
462
00:40:46,582 --> 00:40:48,144
It doesn't really matter.
463
00:40:48,144 --> 00:40:50,485
And, um, this is about a year ago.
464
00:40:50,485 --> 00:40:56,931
So this is before, um, three and these reasoning models came on the scene.
465
00:40:56,931 --> 00:41:10,523
And my rebuttal to that was, well, I think it does matter, understanding
these limitations, because that helps influence how you apply the technology to a
466
00:41:10,523 --> 00:41:11,984
problem domain.
467
00:41:12,165 --> 00:41:12,465
Right.
468
00:41:12,465 --> 00:41:13,666
If you know,
469
00:41:13,698 --> 00:41:30,082
that it truly can't reason and come up with novel ideas, you're going to be better
equipped to um deploy it in a way that's going to lead to success.
470
00:41:30,082 --> 00:41:34,784
I um think it is important for us to understand these limitations.
471
00:41:34,784 --> 00:41:39,685
And also, I'm not a lawyer, but I've had many lawyers on this show.
472
00:41:39,685 --> 00:41:43,366
And they all consistently say that, uh
473
00:41:43,424 --> 00:41:47,417
Legal is a past looking discipline, right?
474
00:41:47,417 --> 00:41:48,048
Everything.
475
00:41:48,048 --> 00:41:51,701
So it doesn't really have to come up with something new.
476
00:41:51,861 --> 00:41:53,088
Now there are, are.
477
00:41:53,088 --> 00:42:08,415
So when, um, Marty Lipton came up with the poison pill concept in the 1980s as a mechanism
to deter hostile takeovers, um, that was a new approach.
478
00:42:08,415 --> 00:42:11,874
Could, could an LLM piece that together?
479
00:42:11,874 --> 00:42:13,275
That's a good question.
480
00:42:13,615 --> 00:42:24,723
I don't know because it was, he used existing mechanisms to create, you know, that
probably would exist in an LLM's dataset training data.
481
00:42:24,723 --> 00:42:29,986
So could an LLM come up with a new poison pill um approach?
482
00:42:30,467 --> 00:42:40,095
Well, eh, it comes up with interesting ideas. But fundamentally, I think it can't come up
with a truly novel idea.
483
00:42:40,356 --> 00:42:45,840
The mathematical proof is the perfect example of where it actually completely falls through.
484
00:42:46,461 --> 00:42:54,728
It kind of depends a little bit on what it means to assemble something together and how
much novelty there is truly in that poison pill.
485
00:42:54,728 --> 00:42:58,211
And I don't really know that well enough to...
486
00:42:58,211 --> 00:42:58,733
m
487
00:42:58,733 --> 00:43:07,050
don't know all the prior examples of that to make a good prediction whether that's
possible.
488
00:43:07,050 --> 00:43:19,300
I guess another thing is if you come up with a complex problem and you want it to plan out
what it should be doing, so we've got this agent technologies, it's not always great at
489
00:43:19,300 --> 00:43:25,666
making the plan and then following through on the plan and definitely not good at seeing
where its plan goes wrong.
490
00:43:25,666 --> 00:43:27,887
I think that's part of this.
491
00:43:27,917 --> 00:43:31,809
this incapacity to truly, truly grasp what's going on.
492
00:43:31,809 --> 00:43:32,059
Right.
493
00:43:32,059 --> 00:43:39,692
So if it's more than just a string manipulation, which is going on, you kind of lose a
certain meaning to it.
494
00:43:39,692 --> 00:43:43,333
Having said that we've been proven over and over wrong.
495
00:43:43,333 --> 00:43:49,196
And we see more and more examples of more complex reasoning being done by LLMs.
496
00:43:49,196 --> 00:43:49,406
Right.
497
00:43:49,406 --> 00:43:53,697
So, and it's interesting, this is all empirical.
498
00:43:54,738 --> 00:43:57,439
Contrary to the software algorithms that you wrote in
499
00:43:57,439 --> 00:44:01,940
BASIC, somebody could just go in and figure out what it was doing there, right?
500
00:44:01,940 --> 00:44:04,202
And see why it can or can't do it.
501
00:44:04,202 --> 00:44:06,783
This is not the case for the LLMs.
502
00:44:06,783 --> 00:44:12,045
We really have to empirically test them as if they're a black box and see if, you know.
503
00:44:12,045 --> 00:44:19,529
So even the greatest minds, the biggest experts, if you ask Yann LeCun or Hinton, they
will have different opinions.
504
00:44:19,529 --> 00:44:22,330
And, you know, you would think these guys would probably just see it.
505
00:44:22,330 --> 00:44:25,691
They know the technology in and out, but it's not that simple.
506
00:44:25,730 --> 00:44:26,090
Yeah.
507
00:44:26,090 --> 00:44:29,492
And they all have wildly different assessments.
508
00:44:29,492 --> 00:44:36,315
I think Yann is, I would say, the most bearish, uh, skeptical.
509
00:44:36,315 --> 00:44:46,470
um I think he likes making press and press-worthy statements, you know, that AI is not
even as smart as a house cat.
510
00:44:46,470 --> 00:44:49,461
you know, those things create headlines, and that gets him attention.
511
00:44:49,461 --> 00:44:50,561
And I think he likes that.
512
00:44:50,561 --> 00:44:52,364
um But...
513
00:44:52,364 --> 00:44:53,404
I know we're almost out of time.
514
00:44:53,404 --> 00:44:58,116
I have a final question for you though, um which I think is a really important one for our
listeners.
515
00:44:58,116 --> 00:45:03,979
So we cater primarily to, like, knowledge management and innovation professionals in large law
firms.
516
00:45:03,979 --> 00:45:17,214
And I'm wondering, where does the future lie in knowledge management, you know, which is
the discipline where you kind of curate and, you know, identify and create
517
00:45:17,214 --> 00:45:21,698
and maintain repositories of model or precedent documents.
518
00:45:21,698 --> 00:45:30,557
that serve as those examples. It kind of reminded me of what you talked about, the rules-based
approach to language translation.
519
00:45:30,557 --> 00:45:37,683
And will we get to a place where the technology can do that?
520
00:45:37,683 --> 00:45:41,239
What are your thoughts on that?
521
00:45:41,239 --> 00:45:49,001
Yeah, I mean, we've touched on that slightly before, but I think we are not there at the
moment.
522
00:45:49,001 --> 00:45:57,364
There's not even a forecast, like an outlook, that that's going to be the case, that, you
know, you could just train a model and have that job handled.
523
00:45:57,364 --> 00:46:02,245
So I would say let's now be very realistic and know the current limitations.
524
00:46:02,245 --> 00:46:03,515
Same message, right?
525
00:46:03,515 --> 00:46:05,526
Find the applications that work.
526
00:46:05,886 --> 00:46:08,887
The knowledge industry can definitely benefit from AI.
527
00:46:08,887 --> 00:46:10,261
I mean, it's just...
528
00:46:10,261 --> 00:46:19,358
undoubtedly. There's probably still some discovery going on about what it can do and how
far it can do it reliably, but it can do it right now.
529
00:46:19,638 --> 00:46:25,102
Now that outlook, that horizon, where we'll be moving towards, will it be possible?
530
00:46:25,102 --> 00:46:28,885
My personal hunch is that yes, it will be.
531
00:46:28,885 --> 00:46:36,891
I've seen too many examples of connectionist models seeing the
532
00:46:36,931 --> 00:46:41,852
I guess the forest through the trees and figuring it out at some point at a level of
complexity.
533
00:46:41,852 --> 00:46:44,173
I don't see why that wouldn't be the case.
534
00:46:45,593 --> 00:46:53,035
The hardest thing will be to figure out what the timeline is for that and the complexity of
the models and the cost associated with running them.
535
00:46:53,035 --> 00:46:55,816
Now, interestingly enough, we have, I think, an upper limit, right?
536
00:46:55,816 --> 00:46:59,007
Our brain is embedded in this physical world.
537
00:46:59,007 --> 00:47:00,797
It is a computer.
538
00:47:00,837 --> 00:47:04,158
It's pretty cheap to run in terms of energy capacity.
539
00:47:04,158 --> 00:47:06,939
em So there is definitely...
540
00:47:07,437 --> 00:47:17,099
we should at some point achieve something like that. I mean, that's the upper limit, or
rather the lower limit on the cost, that we should achieve at
541
00:47:17,099 --> 00:47:18,601
some point.
542
00:47:18,942 --> 00:47:21,805
I'm bullish on that being the case.
543
00:47:21,805 --> 00:47:26,234
I just don't know when, if that's not too vague of an answer.
544
00:47:26,234 --> 00:47:26,794
I get it.
545
00:47:26,794 --> 00:47:35,811
And then, you know, um I'm very bullish on the need for knowledge management, at least in the
near to mid-term.
546
00:47:35,811 --> 00:47:37,352
It's needed more than ever.
547
00:47:37,352 --> 00:47:43,977
Like, as we transition out of this billable hour model, which we're going to, uh we're
going to go kicking and screaming.
548
00:47:43,977 --> 00:47:44,757
it's
549
00:47:45,312 --> 00:47:48,255
it will still play a role in how things get priced.
550
00:47:48,255 --> 00:47:56,574
But at the end of the day, I don't think customers are going to pay for time like they
used to given these new technology advancements.
551
00:47:56,574 --> 00:48:02,740
I think that puts uh knowledge management in a position where they can really drive bottom
line performance.
552
00:48:02,740 --> 00:48:08,979
um And that's going to be really important to the business.
553
00:48:08,979 --> 00:48:18,159
I think we'll see a lot of potential for automation that's driven by access to good knowledge
assets.
554
00:48:18,159 --> 00:48:32,059
So you'll get great automation starting from a knowledge asset, finding some additional
inputs, and getting close to an output product, as long as you have clear sight on
555
00:48:32,059 --> 00:48:34,719
what those good assets are.
556
00:48:34,719 --> 00:48:37,019
I'm with you.
557
00:48:37,207 --> 00:48:38,609
Put the investment there now.
558
00:48:38,609 --> 00:48:44,234
Put the investment into finding the information, enriching it, and into the search
technology to find it.
559
00:48:44,295 --> 00:48:50,962
And then I would say experiment with AI to see what automation you can drive on top of
that in the actual legal flow.
560
00:48:51,800 --> 00:48:52,630
Yeah.
561
00:48:52,871 --> 00:48:55,565
Well, this has been a fantastic conversation.
562
00:48:55,565 --> 00:48:57,157
I've really enjoyed it.
563
00:48:57,157 --> 00:49:02,342
And em I appreciate you spending a few minutes with us here today.
564
00:49:04,025 --> 00:49:05,046
Yeah.
565
00:49:05,407 --> 00:49:08,009
Are you going to be at ILTACON this year?
566
00:49:08,117 --> 00:49:09,870
I will not be at ILTACON.
567
00:49:09,870 --> 00:49:11,452
I'm on holiday.
568
00:49:11,452 --> 00:49:19,113
I regret that now, but I'll find some opportunity to meet you in real life so we can
continue this conversation.
569
00:49:19,162 --> 00:49:20,384
Absolutely.
570
00:49:20,384 --> 00:49:21,326
OK, great.
571
00:49:21,326 --> 00:49:24,873
Well, thanks again, and we'll catch up soon.
572
00:49:25,575 --> 00:49:27,178
All right, thanks, Jan.
00:00:06,485
Jan, how are you this afternoon, or I guess this evening, your time?
2
00:00:06,485 --> 00:00:07,401
It's evening.
3
00:00:07,401 --> 00:00:07,924
Good.
4
00:00:07,924 --> 00:00:08,556
Thanks, Dad.
5
00:00:08,556 --> 00:00:09,710
Thanks for having me.
6
00:00:09,710 --> 00:00:11,630
Yeah, I'm excited about the conversation.
7
00:00:11,630 --> 00:00:17,730
We've been trying to get this scheduled for a while, so I'm glad we're actually making it
happen.
8
00:00:18,830 --> 00:00:23,450
Why don't we get you introduced for the folks that don't know you?
9
00:00:23,850 --> 00:00:26,670
You've been around for quite some time.
10
00:00:26,670 --> 00:00:27,850
You're now at iManage.
11
00:00:27,850 --> 00:00:29,570
You were formerly at Raven.
12
00:00:29,650 --> 00:00:32,230
You were even all the way back in the autonomy days.
13
00:00:32,230 --> 00:00:36,870
But why you tell everybody about your background and what you're up to today?
14
00:00:36,919 --> 00:00:45,355
I guess if I go way back then, I studied as an engineer em and specifically did AI at uni
and that's quite a while ago.
15
00:00:45,355 --> 00:00:47,426
That was my second one.
16
00:00:47,426 --> 00:00:50,368
did chip design, hardware design first.
17
00:00:50,368 --> 00:00:59,985
Then I moved into AI research with the Steel company and that's where I decided that I
should probably leave Steel behind.
18
00:00:59,985 --> 00:01:03,036
Still, I regret that it was an amazing company to work for.
19
00:01:03,259 --> 00:01:05,098
I joined autonomy.
20
00:01:05,118 --> 00:01:06,433
That's exactly it.
21
00:01:06,433 --> 00:01:07,263
You're right there.
22
00:01:07,263 --> 00:01:09,435
So worked in enterprise search for quite a while.
23
00:01:09,435 --> 00:01:14,639
And then we decided with a couple of us to leave autonomy behind and, and start Raven.
24
00:01:14,639 --> 00:01:18,491
So I was one of the co-founders of Raven was CTO there for seven years.
25
00:01:18,671 --> 00:01:22,103
And we got acquired by IMAGE in 2017.
26
00:01:22,694 --> 00:01:30,589
And I stayed in engineering positions and now VP of product management to still product
positions for all my term at IMAGE.
27
00:01:30,589 --> 00:01:33,541
It's also been seven years by the way.
28
00:01:33,541 --> 00:01:36,263
And yeah, main mission has been to.
29
00:01:36,951 --> 00:01:43,826
build out an AI team, bring AI to the cloud and get it embedded into the products of the
image portfolio.
30
00:01:43,826 --> 00:01:45,177
That's really been my role.
31
00:01:45,177 --> 00:01:47,078
em Yeah.
32
00:01:47,078 --> 00:01:54,904
And I guess just to maybe like summarize it, I guess I've been wearing an engineering hat
for most of my career.
33
00:01:54,904 --> 00:02:00,088
So as an engineer, I look at what is at our disposal in technology and what can we do with
it.
34
00:02:00,088 --> 00:02:03,440
But also I've got this kind of science hat, right?
35
00:02:03,440 --> 00:02:05,161
And the science hat is more about.
36
00:02:05,357 --> 00:02:07,210
Where are we moving towards in the longer future?
37
00:02:07,210 --> 00:02:08,461
Where is this trending?
38
00:02:08,461 --> 00:02:11,507
And the timeframes are slightly different.
39
00:02:11,507 --> 00:02:15,261
I think it's months and a couple of years for engineering.
40
00:02:15,261 --> 00:02:19,937
It's more of longer, many years for us as scientists that I look at things.
41
00:02:20,406 --> 00:02:20,887
Interesting.
42
00:02:20,887 --> 00:02:23,380
Well, you were way ahead of the curve on AI.
43
00:02:23,380 --> 00:02:33,131
What was it that drove you in that direction, you know, so early on when AI was still kind
of somewhat niche?
44
00:02:33,195 --> 00:02:34,195
Yeah, it was.
45
00:02:34,195 --> 00:02:39,087
mean, it's definitely pre all the connectionist's model as we, as we call it, right?
46
00:02:39,087 --> 00:02:40,737
The connections is the neural network.
47
00:02:40,737 --> 00:02:44,438
So when I got into it, was before that time.
48
00:02:44,538 --> 00:02:47,359
was just this, just this.
49
00:02:47,999 --> 00:02:56,351
fact that on the one hand, intelligence and consciousness is something that really
interests me a lot.
50
00:02:56,351 --> 00:02:59,952
in the, you know, the fact that it just emerges into the world.
51
00:02:59,952 --> 00:03:03,223
And then secondly, that there's this field of IT, which is
52
00:03:03,299 --> 00:03:04,679
pursuing this, right?
53
00:03:04,679 --> 00:03:12,719
It's on the one hand, trying to investigate and explain what our intelligence is all about
and our reasoning processes are all about.
54
00:03:12,719 --> 00:03:22,579
On the other hand, it's also bringing these technologies then to the field of our
practical applications, embedding it into products and making things happen with it.
55
00:03:22,639 --> 00:03:31,489
And this fact that you could make machines behave in a semi or seemingly intelligent way
is something that I always like.
56
00:03:31,489 --> 00:03:34,971
That's why I picked up the study and I've always stuck with it.
57
00:03:35,360 --> 00:03:38,530
And when did you actually get involved into the field?
58
00:03:38,530 --> 00:03:39,733
Like what year?
59
00:03:42,614 --> 00:03:45,095
2001 I think is when I graduated.
60
00:03:45,315 --> 00:03:47,739
it's been a while.
61
00:03:47,886 --> 00:03:55,283
Yeah, I mean, that was so Watson on Jeopardy was in the 90s, right?
62
00:03:55,283 --> 00:04:01,805
Yeah, and we had the chess computer before, they were just deep search models, right, as
you call it.
63
00:04:01,865 --> 00:04:10,867
And then we had the, specialty was support vector machines, which kind of went out of
fashion as neural networks stepped in.
64
00:04:11,147 --> 00:04:19,830
And I worked on trying to do, for instance, corrosion detection, the type of corrosion on
steel plates, because it was a steel company, right?
65
00:04:19,830 --> 00:04:24,171
And so we kind of had a guy who
66
00:04:24,171 --> 00:04:29,653
He evaluated steel plates by looking at it and said like, it's 10 % corroded by this type
of corrosion.
67
00:04:29,653 --> 00:04:37,256
And then we built training sets and SVM to train on them and to completely make his job
redundant.
68
00:04:37,256 --> 00:04:41,418
He liked it because he, I mean, he liked being made redundant for that full task.
69
00:04:41,418 --> 00:04:44,819
That was not the joy of his day, let's say.
70
00:04:44,942 --> 00:04:56,622
Yeah, well, yeah, I mean, so I paid attention during those early years when I started my
technology journey very early, fifth grade.
71
00:04:57,002 --> 00:04:59,722
So this would have been 1982.
72
00:04:59,882 --> 00:05:08,262
got a Texas Instruments 994A personal computer, an extended basic cartridge, and a book
about
73
00:05:08,312 --> 00:05:12,764
two and a half inches thick that just had all the syntax of the different commands.
74
00:05:12,764 --> 00:05:19,106
And I mean, I was 10 years old and I was totally geeking out on this and building little
programs.
75
00:05:19,106 --> 00:05:24,289
I remember I built an asteroid program where basically the asteroids didn't move.
76
00:05:24,289 --> 00:05:29,856
I wasn't that sophisticated, but you could navigate a little spaceship across the static
asteroid field.
77
00:05:29,856 --> 00:05:37,974
But you know, I 10 years old and then I got out of it in high school because chicks don't
want to talk to
78
00:05:38,286 --> 00:05:54,546
guys so I stepped away and then found it again back after college when the you know so
many things had changed so much but you know AI really kind of hit my radar it was the
79
00:05:54,546 --> 00:06:07,430
AlphaGo you know that was like the moment like wow but you know since then I've been you
know chat GPT
80
00:06:07,490 --> 00:06:09,822
and oh all these new capabilities.
81
00:06:09,822 --> 00:06:12,515
I'm spending a lot of time there.
82
00:06:12,515 --> 00:06:18,740
And I'm finding a lot of amazing efficiencies.
83
00:06:18,781 --> 00:06:26,087
You saw the agenda that I put together for us that was an output of we had a conversation
on a planning call.
84
00:06:26,087 --> 00:06:33,642
I took the transcript, it into a custom-clawed project with examples in its training
materials and custom instructions.
85
00:06:33,642 --> 00:06:39,186
and that used to take me, I used to have to go back and listen to the recording again and
take notes.
86
00:06:39,307 --> 00:06:45,452
So it would be a 30 minutes on the call, then another 30 minutes at least to listen and
get all the details.
87
00:06:45,452 --> 00:06:48,334
And now it takes me about three minutes.
88
00:06:48,669 --> 00:06:58,457
So these, mean, coming to this topic of the efficiencies, I actually went out and looked a
little bit because like one of the things I've been fascinated about is how does like a
89
00:06:58,457 --> 00:07:03,341
knowledge industry like legal compared to other knowledge industries, for instance,
engineering, right?
90
00:07:03,341 --> 00:07:12,539
So how do they, why is it then the engineers treat themselves to better tools sometimes
than the legal workers to make their life easier?
91
00:07:12,539 --> 00:07:17,653
So I started looking for data to back this up specifically then in the AI land.
92
00:07:17,795 --> 00:07:21,806
So I found this study was done by GitHub and it's on their own product, right?
93
00:07:21,806 --> 00:07:27,968
On copilot, GitHub copilot, which is probably not the thing you just take as a scientific
research paper, right?
94
00:07:27,968 --> 00:07:29,688
Because it's on their own stuff.
95
00:07:29,688 --> 00:07:42,462
But they did say that when they rolled it out to an organization that they have like 95 %
adoption on the same day by every user, practically every user starts using it.
96
00:07:42,582 --> 00:07:46,383
And then they get to what does it actually help them with?
97
00:07:46,903 --> 00:07:52,205
they claimed that it was a 55 % time saved on coding tasks.
98
00:07:52,986 --> 00:07:57,988
But I don't know if that's actually backed by real data or it was the perception of the
people.
99
00:07:57,988 --> 00:08:01,869
And one of the metrics I track is published by METER.
100
00:08:01,869 --> 00:08:14,755
I don't know if you know METER, but METER just published a report a couple of days ago on
how AI helps open source developers in there, how it speeds them up and how much they
101
00:08:14,755 --> 00:08:15,459
think in...
102
00:08:15,459 --> 00:08:18,279
advance it will speed them up and then how much it actually did.
103
00:08:18,279 --> 00:08:30,499
What they found is that, but they think about, they hope for 20%, 30 % speed up, but they
suffer from a 12 % slowdown when using AI, which kind of really baffled me.
104
00:08:30,499 --> 00:08:34,579
That's very contradictory to what the Copilot people were saying.
105
00:08:34,979 --> 00:08:44,621
Maybe the most interesting one was that, and that one I believe, is that from the IT
developers who use an AI assistant encoding is that 90 %
106
00:08:44,621 --> 00:08:47,833
felt more fulfilled in their job.
107
00:08:47,954 --> 00:09:00,414
And that's, know, if anything else, that is something that I would be interested in,
especially because TR did some survey and they found that the number one thing that legal
108
00:09:00,414 --> 00:09:02,866
workers want to improve is their work-life balance.
109
00:09:02,866 --> 00:09:07,990
So if fulfillment is something that can bring them and make them happier, then at least
it's that.
110
00:09:08,951 --> 00:09:13,515
But yeah, I think it's been slower in the uptake and legal, but it's also not happening.
111
00:09:13,515 --> 00:09:14,381
Maybe...
112
00:09:14,381 --> 00:09:26,073
three, five, three years ago, definitely in the Raven days, we could claim like, there's
always the skepticism and lack of trust and I think that's with the, know, Chat GPTs and
113
00:09:26,073 --> 00:09:30,057
the LLMs that has changed or is changing and has already changed.
114
00:09:30,722 --> 00:09:38,464
Yeah, know, uh Ethan Malik talks a lot about kind of the jagged edge of AI in terms of
capabilities.
115
00:09:38,464 --> 00:09:44,306
And, you know, I noticed that, so my coding skills are largely out of date other than SQL.
116
00:09:44,306 --> 00:09:49,347
um I was on the SQL team at Microsoft many years ago and SQL hasn't changed much.
117
00:09:49,347 --> 00:09:59,530
um So um I'm able to still do some things in there and I do from time to time, you know,
analyze data and whatnot.
118
00:09:59,530 --> 00:10:10,275
And I have noticed a very um high degree of variation in terms of even from really good
models like Claude on for coding.
119
00:10:10,275 --> 00:10:22,620
Like just yesterday, I tried to, uh downloaded a little freeware app called Auto Hotkey
and, you know, trying to be more efficient m and a common snippets.
120
00:10:22,620 --> 00:10:28,122
would, and I had, I had Claude write me a script and it took me like,
121
00:10:28,686 --> 00:10:32,126
It took me like five times to iterate through it for it to get it right.
122
00:10:32,126 --> 00:10:40,086
You know, the first time it did it on the previous version of Auto Hotkey, you know, you
didn't, and now the syntax is a little different.
123
00:10:40,106 --> 00:10:49,486
Then it, you know, I was basically having it control, pay a control V, uh, paste into an
app and it would only paste part of the string.
124
00:10:49,486 --> 00:10:50,986
And then I had to ask it why.
125
00:10:50,986 --> 00:10:57,998
And then it, you know, I basically had to put a little timer delay in there to get it to
pace the full string before it.
126
00:10:57,998 --> 00:11:00,078
terminated the thread, I guess.
127
00:11:00,498 --> 00:11:07,478
then on other scenarios like SQL, if I have, let's say, a little access database, I'll
pull some data down.
128
00:11:07,478 --> 00:11:22,418
If I don't want to mess with SQL, and I'll export the database schema into PDFs, upload it
into an LLM, and ask it to write a query that will require me to go search for syntax,
129
00:11:22,418 --> 00:11:27,896
like a correlated subquery or something that I'm not doing.
130
00:11:27,896 --> 00:11:30,733
frequently and it usually nails it.
131
00:11:31,086 --> 00:11:35,680
I think it's there's that jagged edge concept is real.
132
00:11:35,757 --> 00:11:43,600
mean, some of these shortcomings, let's say, are then picked up, picked on and joked
about.
133
00:11:43,600 --> 00:11:46,882
Like we had this, I don't know if you remember this, strawberry.
134
00:11:46,882 --> 00:11:53,015
Yeah, so why can't they tell me how many Rs are there in the word strawberry?
135
00:11:53,015 --> 00:12:02,099
But then if you actually dig deeper, what happens under the hood is the model never sees
the word strawberry.
136
00:12:02,679 --> 00:12:09,133
You know, what happens is there's a tokenizer and the tokenizer splits the words into
individual subparts.
137
00:12:09,133 --> 00:12:17,459
then though each of those might be straw and berry or bear and re or it might be just one
token, you you don't really know.
138
00:12:17,459 --> 00:12:23,402
But the key thing is that it then converts that into like a numerical vector.
139
00:12:23,402 --> 00:12:25,494
And that's really what the model reasons with.
140
00:12:25,494 --> 00:12:27,957
So for all it.
141
00:12:27,957 --> 00:12:31,488
knows it could be strawberry written in French, which is phrase.
142
00:12:31,488 --> 00:12:34,529
mean, it would be the same vector at sea.
143
00:12:34,529 --> 00:12:39,380
because it never has access to that something we see, which is the word, it couldn't
answer that question.
144
00:12:39,380 --> 00:12:46,332
It could just like probably just look in its memory of things it's seen that is close and
then just try to make an educated guess.
145
00:12:46,332 --> 00:12:48,833
So there's explanations.
146
00:12:48,833 --> 00:12:55,154
And then once you know the explanation, you can work towards solving them as well, of
course.
147
00:12:56,495 --> 00:12:57,235
I guess
148
00:12:57,235 --> 00:13:05,633
One I don't want to distract too much, but one that really fascinates me is the alignment
problem.
149
00:13:05,633 --> 00:13:12,929
And alignment kind of comes down to these LLMs are really very rough gems.
150
00:13:13,530 --> 00:13:16,472
They're language prediction machines.
151
00:13:16,472 --> 00:13:20,015
They've seen a lot of text, like all the text is actually on the internet.
152
00:13:20,015 --> 00:13:24,359
And then what we give them is some input and...
153
00:13:24,705 --> 00:13:27,997
the model needs to complete whatever we've given them.
154
00:13:28,378 --> 00:13:38,046
But, and the way that these big vendors make them do something that's actually valuable to
them is by a second training step, this reinforcement learning.
155
00:13:38,046 --> 00:13:42,870
The one that actually AlphaGo, you know, that's where AlphaGo became famous for the...
156
00:13:42,870 --> 00:13:45,832
So there's this two-phase training process.
157
00:13:45,832 --> 00:13:54,179
On the one hand, these LLMs consume all the text and they have to predict the next word,
just like, you know, the cell phone next word prediction thing works.
158
00:13:54,179 --> 00:14:05,234
And then secondly, to teach them about values or the goals that they should achieve, they
get this reinforcement, the learning.
159
00:14:05,234 --> 00:14:07,985
the reinforcement is kind of like a carrot and a whip.
160
00:14:07,985 --> 00:14:11,336
Like when they get the right answer, then they get a carrot.
161
00:14:11,336 --> 00:14:14,458
And if they don't get the right answer, they get whipped by some human being.
162
00:14:14,458 --> 00:14:16,288
That's essentially what happens, right?
163
00:14:16,789 --> 00:14:21,710
And that's how they get shaped into making sure that they do something useful for us.
164
00:14:22,689 --> 00:14:25,080
And Tropic has looked into that quite a bit.
165
00:14:25,080 --> 00:14:34,722
And what is really fascinating is that it gets, you know, the bigger the model becomes and
the, guess you could say the smarter it becomes, the harder it is to get them aligned with
166
00:14:34,722 --> 00:14:36,143
what we want them to do.
167
00:14:36,143 --> 00:14:39,233
They really try to uh cheat us, right?
168
00:14:39,233 --> 00:14:41,924
That's, they see exactly.
169
00:14:41,924 --> 00:14:44,645
They try, they talk very nice to us.
170
00:14:44,645 --> 00:14:46,766
They, they think like we're the best.
171
00:14:46,766 --> 00:14:52,557
That's, know, and they, but more importantly, I guess more scientifically is if you give
them a coding test.
172
00:14:52,557 --> 00:14:54,238
they tried to take shortcuts.
173
00:14:54,238 --> 00:14:56,888
They don't necessarily write a program that actually works.
174
00:14:56,888 --> 00:15:03,530
They try to write a program that satisfies the test conditions, which is not necessarily
the same thing.
175
00:15:03,830 --> 00:15:06,931
And that's where it gets really fascinating.
176
00:15:06,931 --> 00:15:11,173
You can see this human behavior slipping into them.
177
00:15:11,173 --> 00:15:19,415
And it will be a challenge to keep on, at least with this technology, to keep on making
them useful for us.
178
00:15:19,756 --> 00:15:20,566
Yeah.
179
00:15:20,566 --> 00:15:33,490
Well, you mentioned coding and like how the last time you and I spoke when we were getting
prepared for this episode, we talked about how um the kind of the contrasting approach
180
00:15:33,490 --> 00:15:44,033
between how legal professionals leverage or view AI and software engineers with tools like
GitHub Copilot.
181
00:15:44,033 --> 00:15:47,924
And there's kind of different mindsets, different approaches.
182
00:15:47,924 --> 00:15:49,154
What is your?
183
00:15:49,420 --> 00:15:50,952
What is your take on that?
184
00:15:51,843 --> 00:16:00,669
I there's definitely like a difference in adoption, the difference of adoption that has
been around for a while.
185
00:16:00,829 --> 00:16:04,171
mean, the IT and software world can't be compared to the legal world.
186
00:16:04,171 --> 00:16:14,698
If you look at, I'll just bring up an example that I've mentioned in the past, just to
illustrate how different these industries look at things as the open source movement,
187
00:16:14,698 --> 00:16:14,969
right?
188
00:16:14,969 --> 00:16:17,921
So the open source movement was a big movement.
189
00:16:17,921 --> 00:16:20,142
I guess it goes back to this sixties or seventies.
190
00:16:20,142 --> 00:16:22,013
I don't know exactly when it started.
191
00:16:22,115 --> 00:16:33,195
where some universities and even individuals and companies decided that they would just
throw all their intellectual property in the open and share it with everyone with the
192
00:16:33,195 --> 00:16:43,415
belief that that would actually fast track the entire industry and it would accelerate
them rather than, you know, give all their most valuable assets away.
193
00:16:43,415 --> 00:16:49,635
That is something that's completely unthinkable as a business concept, I think, in the
legal industry.
194
00:16:49,635 --> 00:16:51,917
While maybe it could also fast...
195
00:16:51,917 --> 00:16:54,579
track or uh accelerate or fuel the industry.
196
00:16:54,579 --> 00:16:56,510
We don't really know how that would end.
197
00:16:56,510 --> 00:17:04,175
there was definitely, Microsoft was one of the big fighters against the open source
movement because they thought it was going to ruin everything.
198
00:17:04,415 --> 00:17:06,096
It has changed, of course.
199
00:17:06,217 --> 00:17:08,628
I just wanted to take that up as an example.
200
00:17:08,628 --> 00:17:16,533
So there's definitely a change in attitude and maybe it's risk aversion and probably with
201
00:17:16,631 --> 00:17:29,810
with reason, like the output quality, the risks around data privacy and being exposed as
an individual, like that lawyer that used the 2023, that New York lawyer that wrote the
202
00:17:29,810 --> 00:17:30,701
brief.
203
00:17:30,701 --> 00:17:37,725
that, I mean, no developer really, I think has that same risk that they would get exposed
in this way.
204
00:17:37,825 --> 00:17:40,817
Software gets written and gets double checked by machines.
205
00:17:40,817 --> 00:17:43,089
And of course it has to function before it goes out.
206
00:17:43,089 --> 00:17:46,721
So there's more of a personality around there that matters.
207
00:17:46,947 --> 00:17:49,008
There's a different business model, of course, right?
208
00:17:49,008 --> 00:18:01,431
The billing, then I'm talking about law firms, the billing by the hour model that
definitely doesn't really encourage the super efficiency, which is very different for
209
00:18:01,911 --> 00:18:02,641
corporate legal.
210
00:18:02,641 --> 00:18:12,114
we, by the way, I think even with an image, we see that with our customers, that there's a
difference in attitude and uptake between corporate legal and law firms.
211
00:18:12,694 --> 00:18:15,425
Maybe it's as a personality.
212
00:18:15,567 --> 00:18:17,887
Maybe there's a knowledge gap.
213
00:18:17,887 --> 00:18:31,516
I think we've touched on the fact that there's definitely like an immediate return on
investment mentality versus engineering firms where there's more of an R &D, true R &D.
214
00:18:31,516 --> 00:18:38,981
Like let's the budget aside and let some innovation brew in that budget.
215
00:18:38,981 --> 00:18:43,875
mean, that's just engineering firms have to innovate that way.
216
00:18:43,875 --> 00:18:45,197
to be able to be future-proof.
217
00:18:45,197 --> 00:18:54,197
And I think that's a mentality not really baked into the legal industry, just because
there was never a need for it.
218
00:18:54,540 --> 00:18:54,890
Right.
219
00:18:54,890 --> 00:18:57,452
Yeah, I've written about this quite a bit.
220
00:18:57,452 --> 00:19:00,373
And that's due to a number of factors.
221
00:19:00,373 --> 00:19:09,798
I would say the most uh highly contributing factor in the legal industry to this, how
foreign R &D is, it's the partnership model.
222
00:19:09,958 --> 00:19:16,462
So the partnership model is very much a partnership model that operates on a cash basis.
223
00:19:16,522 --> 00:19:18,823
R &D expenses are accrued.
224
00:19:18,823 --> 00:19:24,192
um Even if your uh tax treatment accelerates that
225
00:19:24,192 --> 00:19:33,048
for tax purposes in general on your internal books, you amortize R &D costs over its
useful life.
226
00:19:33,048 --> 00:19:43,916
um law firm partnerships are very much um about maximizing profits at the end of the year.
227
00:19:43,916 --> 00:19:52,562
And I think that's one of the big hurdles that law firms face when trying to
228
00:19:52,686 --> 00:20:00,446
map their strategy with respect to AI, there's going to be some experimentation and some R
&D that's required.
229
00:20:01,066 --> 00:20:09,986
And focusing too much on immediate ROI, I think is going to limit risk taking and
ultimately hold firms back.
230
00:20:09,986 --> 00:20:12,546
I actually see it every day.
231
00:20:13,806 --> 00:20:19,026
I've done business with about 110 AMLaw firms when I stopped counting.
232
00:20:19,626 --> 00:20:22,046
so I've seen a good cross-sectional view.
233
00:20:22,046 --> 00:20:32,946
I have, talk to firms on a frequent basis where I hear things like we're going to, we're,
going to wait and see because we really can't articulate an ROI today because it's going
234
00:20:32,946 --> 00:20:34,658
to, it's, it's reducing the billable hour.
235
00:20:34,658 --> 00:20:45,047
I would say those firms are more and more starting to be in the minority and most firms
now, especially the big ones get that wait and see is a bad idea.
236
00:20:45,047 --> 00:20:48,343
But yeah, I think the partnership model is a big, a big factor in this.
237
00:20:48,343 --> 00:20:50,884
Well, that's why I was going to ask you, do you think there's change?
238
00:20:50,884 --> 00:20:59,606
Like, because we see ANO with Harvey, like that's definitely some kind of jump into like a
big unknown.
239
00:20:59,806 --> 00:21:07,428
And even in I-Manage, like we see the, for instance, the uptake of Ask I-Manage, which is
our LLM based product.
240
00:21:08,249 --> 00:21:12,870
It's the fastest uptake that we've seen for any of our products before.
241
00:21:12,870 --> 00:21:15,731
And that is firms who want to just...
242
00:21:15,757 --> 00:21:19,670
don't miss out and want to experiment because they're not just buying us.
243
00:21:19,951 --> 00:21:23,434
They're trying different things and seeing what sticks.
244
00:21:23,434 --> 00:21:32,263
And there's quite some in-house initiatives and teams being spun up, at least probably in
the larger law firms that's happening.
245
00:21:32,263 --> 00:21:35,285
uh I would, by the way, definitely encourage that.
246
00:21:35,285 --> 00:21:36,667
So I'm on board with you.
247
00:21:36,667 --> 00:21:41,191
Like, encourage the in-house experiment, set some budget aside for it.
248
00:21:41,869 --> 00:21:46,941
Try different vendors, try software yourself, see what works and don't just write it off.
249
00:21:46,941 --> 00:21:48,612
Like figure out the constraints.
250
00:21:48,612 --> 00:21:49,763
That's really it, right?
251
00:21:49,763 --> 00:21:56,806
These products have certain constraints, figure out what the constraints are, but figure
out within those constraints what you can do with it.
252
00:21:56,946 --> 00:21:58,651
That would be my suggestion.
253
00:21:58,651 --> 00:22:04,935
And it's hard to put in a spreadsheet, the R in the ROI, the return is learning.
254
00:22:05,736 --> 00:22:09,358
And again, that's hard to quantify and put a figure on.
255
00:22:09,358 --> 00:22:18,584
But at the end of the day, if you're not thinking that way, you're going to limit risk
taking.
256
00:22:19,150 --> 00:22:25,229
you're not going to push forward at the pace at which you're going to need to to keep up.
257
00:22:25,229 --> 00:22:27,110
um
258
00:22:27,210 --> 00:22:28,050
in my opinion.
259
00:22:28,050 --> 00:22:37,393
um What about, so, you you in the world of document management, you know, I see a lot of
document management systems.
260
00:22:37,393 --> 00:22:41,464
don't implement, we're partners with iManage for integration purposes.
261
00:22:41,464 --> 00:22:48,736
So in InfoDash, we surface uh iManage content in intranet and extranet scenarios.
262
00:22:48,736 --> 00:22:56,078
um But as a part of that doing that work for the last almost 20 years, I've seen a lot of
law firm DMSs.
263
00:22:56,526 --> 00:22:58,667
And there's very poor data hygiene.
264
00:22:58,667 --> 00:23:14,617
Um, there's been a lot of kind of mergers and acquisitions where you'll get one mess of a
law firms DMS that gets, um, merged into another and they have different, um, different
265
00:23:14,617 --> 00:23:16,477
types of shortcomings.
266
00:23:17,919 --> 00:23:24,102
and it really seems like an overwhelming task for
267
00:23:24,238 --> 00:23:34,958
these law firms to actually straighten that up to, to, and get it to a place where it
makes sense to point AI at a entire DM corpus.
268
00:23:35,098 --> 00:23:36,958
Um, is that your take as well?
269
00:23:36,958 --> 00:23:41,158
mean, it sounds, it feels like you really need a curated data sets.
270
00:23:41,507 --> 00:23:44,927
Well, mean, you definitely take a step back.
271
00:23:44,927 --> 00:23:49,587
You definitely need to do something about the information that you have, right?
272
00:23:49,587 --> 00:24:00,187
mean, legal as an information business, should be, I guess, obvious that managing and
finding that information should be high on the priority list of what you invest in.
273
00:24:00,447 --> 00:24:03,587
That's the simple statement to make.
274
00:24:04,027 --> 00:24:11,317
we definitely very often hear like, can't we throw all those documents that you have in
the DMS and put it in chat GPT and...
275
00:24:11,349 --> 00:24:14,000
and just get amazing results out of it.
276
00:24:14,241 --> 00:24:23,187
that's, I mean, we, hope they're finding out that that doesn't work and everybody kind of,
if you know the technology, that that's not really how it will work.
277
00:24:23,347 --> 00:24:33,594
So getting a good data set is definitely the, I mean, the strategy that as an engineer,
I'll put on my engineering hat is what you need to pursue right now.
278
00:24:33,594 --> 00:24:33,884
Right.
279
00:24:33,884 --> 00:24:40,699
So the, the data that goes in is also the quality of the data that goes in is also the
quality of the data that comes out.
280
00:24:40,699 --> 00:24:41,331
Now.
281
00:24:41,331 --> 00:24:43,992
Search technology has evolved quite a bit.
282
00:24:43,992 --> 00:24:46,913
there's very interesting things that it can do.
283
00:24:46,913 --> 00:24:51,242
mean, there's the AI has brought us the semantic representation.
284
00:24:51,242 --> 00:24:52,534
I mentioned that before, right?
285
00:24:52,534 --> 00:25:00,826
So the words don't get represented as strings anymore, but they get represented by a
mathematical vector that represents the meaning.
286
00:25:00,826 --> 00:25:06,218
We call it the, these embeddings, vector embeddings.
287
00:25:06,218 --> 00:25:11,515
And simply speaking, it makes sure that, like,
288
00:25:11,563 --> 00:25:18,646
force majeure or act of God, very different strings if you look at them, but they are very
close to each other.
289
00:25:18,646 --> 00:25:21,988
Are they exactly the same when you represent them in meaning space?
290
00:25:21,988 --> 00:25:30,311
So we've got this that has helped, but we really need that combined with the traditional
filters so we can have metadata filters.
291
00:25:30,311 --> 00:25:38,804
you say the document should be, I'm looking for something that's written in the last two
years, no meaning vector is going to help you there.
292
00:25:38,804 --> 00:25:40,155
So you need this.
293
00:25:40,155 --> 00:25:43,457
good metadata on it as well.
294
00:25:43,577 --> 00:25:45,608
And we kind of call that hybrid search, right?
295
00:25:45,608 --> 00:25:55,443
So this hybrid search is the joining of the semantic index, which is very interesting,
together with the traditional search index.
296
00:25:55,443 --> 00:25:58,645
And Microsoft has benchmarked that that's the best approach.
297
00:25:58,645 --> 00:26:09,731
If you compare each one individually, pure semantic or pure traditional, you get lower
scores on finding the right information at the right time.
298
00:26:09,847 --> 00:26:13,480
the information you put into it, still the information that will come out of it, right?
299
00:26:13,480 --> 00:26:23,799
So if you put in a document that you would never want anyone to use, it will come out and
if you don't have the right warnings on it, that might, I mean, that might be very
300
00:26:23,799 --> 00:26:24,620
problematic.
301
00:26:24,620 --> 00:26:33,757
But by the way, just digging a little bit deeper on that search, because I kind of like
search, they also found, and I want to give that to you, is they also found that apart
302
00:26:33,757 --> 00:26:38,207
from hybrid search, semantic re-ranking also has
303
00:26:38,207 --> 00:26:39,927
another 10 % uptake.
304
00:26:39,927 --> 00:26:48,410
Semantic re-ranking means that whatever comes back from the search engine, you pass it
over again based on the question that the user has and then change the order.
305
00:26:48,410 --> 00:26:55,171
So you take a look at the top 50 results, instance, and you say, these results are all
good, but this one is actually the one that should be on number one.
306
00:26:55,171 --> 00:27:00,052
I found that really interesting that it has another 10 % uptake in their tests.
307
00:27:00,893 --> 00:27:05,234
And there's technology beyond that like graph, rag, and...
308
00:27:05,731 --> 00:27:10,002
probably don't have time to go into there, but all of these technologies fit really well
with legal.
309
00:27:10,002 --> 00:27:14,213
So law, legal industry should look into it.
310
00:27:14,213 --> 00:27:22,236
How I take it, how you can get to those best information is you should let the system do
as much of the heavy work as possible.
311
00:27:22,456 --> 00:27:30,348
we found out that enriching the data for instance figuring out what doc type you're
dealing with, what contract type or other legal contract type is something machines can do
312
00:27:30,348 --> 00:27:31,619
reliably well.
313
00:27:31,619 --> 00:27:33,645
Pulling out key.
314
00:27:33,645 --> 00:27:42,382
information points like what's the contract date, who are the parties, some other things
that you typically would want to reference in your searches, try to pull it out and put it
315
00:27:42,382 --> 00:27:47,446
as metadata because that's, mean, and again, that's something that can be done by machines
and can be done reliably well.
316
00:27:47,446 --> 00:27:51,969
It's not LLMs doing it, that would be too expensive, but it's doable.
317
00:27:52,390 --> 00:28:03,447
But then you kind of still need to find the gems in there and those gems, I mean, so far
that's still a human process to figure out like this is one of the, you know.
318
00:28:03,447 --> 00:28:07,290
we've closed this deal, these are the gems that we want to keep.
319
00:28:07,290 --> 00:28:15,416
Let's put a label on it and once you've tagged it somehow, the system can know that it
should prefer those to come back.
320
00:28:15,917 --> 00:28:24,483
And that's our strategy, get the data as rich as possible, ensure that you use the AI as
much as possible to enrich it.
321
00:28:24,864 --> 00:28:28,366
We also believe in bringing the AI to the data, right?
322
00:28:28,366 --> 00:28:33,443
So don't pull all the data out, so you lose all the security context, but bring the AI to
it.
323
00:28:33,443 --> 00:28:40,143
It doesn't have to be IMAGE AI, can be other vendors and then bring all of the smarter
search technology on top of it.
324
00:28:40,443 --> 00:28:43,863
But I've said that all of that, that's my engineering hat, right?
325
00:28:43,863 --> 00:28:56,683
If you think about the science hat, then I do have to say that every time that we've
anything, I mean, comparing to symbolic AI, I don't know if you know what symbolic AI is.
326
00:28:56,683 --> 00:29:00,243
We had this symbolic AI where you try to build rule sets for everything, right?
327
00:29:00,243 --> 00:29:02,963
So we had language translation.
328
00:29:03,043 --> 00:29:06,295
10 years, 15 years ago that was done by software with rules.
329
00:29:06,295 --> 00:29:14,869
Essentially they wrote rules of how English should be translated into French and somebody
managed and maintained and curated all those rules.
330
00:29:15,190 --> 00:29:26,316
But then at some point, what we call the connectionist AI, trained a model by looking at
French and English texts and figuring out what those rules were internalized into a model.
331
00:29:26,316 --> 00:29:32,449
And you can't really look at how it does it, but it does it when we do the benchmark, we
see that it does it.
332
00:29:32,471 --> 00:29:35,953
vastly superiorly well than the traditional rule based one.
333
00:29:35,953 --> 00:29:45,660
And that's the same for grammar correction systems, code generation, I guess now we had
code generations before, or transcription or transcription as well.
334
00:29:46,080 --> 00:29:51,624
These sound bytes where then transcribed into words.
335
00:29:51,624 --> 00:29:59,199
So all of these technologies we've seen that the symbolic version has been surproceeded by
connectionist one.
336
00:29:59,199 --> 00:30:01,781
So I'm just saying,
337
00:30:01,781 --> 00:30:04,953
Right now as an engineer and a product manager, that's what we have to do.
338
00:30:04,953 --> 00:30:08,515
We have to really curate those sets, but five years from now, it could be very different.
339
00:30:08,515 --> 00:30:09,896
And I don't know what it will look like.
340
00:30:09,896 --> 00:30:19,022
Maybe it's the machine doing the curation for us, or it just doesn't need it anymore
because it sees, as long as it has all the information to make the determination, it sees
341
00:30:19,022 --> 00:30:20,002
all of it.
342
00:30:20,183 --> 00:30:23,705
But there is a chance, of course, that the connectionist model overtakes it.
343
00:30:23,705 --> 00:30:26,120
em Just...
344
00:30:26,120 --> 00:30:28,306
That's kind of the Elon Musk.
345
00:30:28,306 --> 00:30:32,247
um Yeah, that's his theory as well.
346
00:30:32,247 --> 00:30:37,131
think I'm not as optimistic about timelines as Elon might be.
347
00:30:37,131 --> 00:30:39,833
That's just the feasibility of it.
348
00:30:39,833 --> 00:30:43,315
em You don't really know.
349
00:30:43,956 --> 00:30:57,415
A very interesting benchmark, I'm not a benchmark person, but a very interesting thing to
track is again on meter is the duration of tasks as done by human, the AI can do with high
350
00:30:57,415 --> 00:30:58,446
reliability.
351
00:30:58,446 --> 00:31:00,097
That's maybe a bit.
352
00:31:00,289 --> 00:31:11,662
difficult sentence, essentially means like, so the, how well an AI can do a task which
takes a certain amount of minutes for a human to do, right?
353
00:31:11,662 --> 00:31:14,383
And they track how well, how that's evolving.
354
00:31:14,383 --> 00:31:22,386
So let's say, m me doing a Google query and looking at the result, that's a task that
takes me about 30 seconds, right?
355
00:31:22,386 --> 00:31:27,387
em Replying to an email em or writing a...
356
00:31:27,651 --> 00:31:30,451
one class of code takes me about four minutes.
357
00:31:30,451 --> 00:31:34,551
So you could say this, these are increasingly more complex tasks.
358
00:31:34,791 --> 00:31:38,411
Some tasks take a human an hour to do or take four hours to do.
359
00:31:38,411 --> 00:31:47,191
And what they do is they let the machine or they benchmark how well an LLM or some AI does
performs at these tasks.
360
00:31:47,191 --> 00:31:57,871
So right now they got to the point that six to 10 minutes tasks can be done with high
success rate by LLMs.
361
00:31:57,891 --> 00:32:03,331
And that's been the length of that duration of the task has been doubling every seven
months.
362
00:32:03,331 --> 00:32:12,011
So every seven months, so within seven months, you could expect it to go to tasks that
would take us 15 minutes, but then seven months later, it's 30 minutes, right?
363
00:32:12,011 --> 00:32:22,431
And at some point you kind of have quite an individual or let's say autonomous AI doing
work.
364
00:32:22,811 --> 00:32:26,699
So, I mean, so it's again, it's benchmark and it's a
365
00:32:26,699 --> 00:32:32,832
It's a forecast and you know, you can't trust forecast, but I think it's a very
interesting one that we've been tracking.
366
00:32:32,832 --> 00:32:49,970
that one, I think will matter as to, you know, whether these curation problems can be
solved or if complex legal tasks will be fixable, will be doable, I mean, by AIs.
367
00:32:49,970 --> 00:32:53,481
So, I think that's one to keep an eye for.
368
00:32:54,178 --> 00:32:55,199
Super interesting.
369
00:32:55,199 --> 00:33:01,863
um What are your thoughts on how far we can get?
370
00:33:01,863 --> 00:33:17,792
And I don't know down what path, uh you know, whether that's AGI or ASI or whatever's next
after that with the current kind of LLM transformer architecture.
371
00:33:17,792 --> 00:33:21,574
It seems to me like they're, it's not going to
372
00:33:21,612 --> 00:33:23,284
This isn't the end state.
373
00:33:23,284 --> 00:33:27,767
This is an intermediate state to whatever's next.
374
00:33:27,767 --> 00:33:30,230
And I have no idea what that might look like.
375
00:33:30,230 --> 00:33:38,797
But there are just some shortcomings with this particular approach that we've managed to
have really good workarounds.
376
00:33:38,797 --> 00:33:49,606
Like uh maybe a year ago, um I would sit here and tell you that LLMs can't reason, that
they don't understand
377
00:33:50,243 --> 00:33:53,205
the question or the prompt that you put in there.
378
00:33:53,205 --> 00:33:59,541
And there's been a lot of workarounds, you know, with inference time compute and, um, that
have worked around that.
379
00:33:59,541 --> 00:34:06,559
Well, I don't know that I could sit here and say that today because the output looks so
convincing, but
380
00:34:06,559 --> 00:34:07,550
I had the same thing.
381
00:34:07,550 --> 00:34:11,081
had the, and they can reach out to tools, right?
382
00:34:11,081 --> 00:34:12,611
That's also something we've given them.
383
00:34:12,611 --> 00:34:15,492
There's an ability that they can call out to other tools.
384
00:34:15,492 --> 00:34:18,483
For instance, they were never very good at doing maths.
385
00:34:18,483 --> 00:34:20,884
Simple calculations couldn't be done.
386
00:34:20,884 --> 00:34:26,345
Probably also related to this entire representation problem, em but they couldn't do it.
387
00:34:26,345 --> 00:34:31,747
And then now they could just reach out to a calculator and do the calculations and pull
the results back and use it.
388
00:34:31,747 --> 00:34:31,967
Right?
389
00:34:31,967 --> 00:34:32,707
So.
390
00:34:32,899 --> 00:34:34,439
All right, but that wasn't your problem.
391
00:34:34,439 --> 00:34:39,979
The problem is can LLMs fundamentally do semantic reasoning tasks, right?
392
00:34:39,979 --> 00:34:42,699
And take that very far.
393
00:34:42,959 --> 00:34:48,799
I think that is one of the best questions to ask and also one of the hardest ones to
answer.
394
00:34:49,059 --> 00:34:54,359
My mind is like, so I've always said, no, it can't be done.
395
00:34:54,619 --> 00:34:57,859
it's a, LLMs are curve fitting.
396
00:34:58,119 --> 00:35:02,719
So they see a lot of, they've seen a lot of data on the internet and they fit the curve.
397
00:35:03,303 --> 00:35:03,923
on that.
398
00:35:03,923 --> 00:35:08,164
So the curve fitting is only as good as the data it has seen.
399
00:35:08,164 --> 00:35:12,346
And the only thing they can come up with is something that is somewhere on that curve.
400
00:35:12,346 --> 00:35:14,506
So they can't think out of that box.
401
00:35:14,586 --> 00:35:29,620
And we as humans, I think, prove that we don't have to see, just to go a bit further on
that, they might still amaze us every day because they come up with an answer that amazes
402
00:35:29,620 --> 00:35:32,531
us that we wouldn't know, for instance.
403
00:35:33,325 --> 00:35:39,488
But if you would have seen all the data that they've seen, maybe that answer doesn't
actually amaze you, right?
404
00:35:39,488 --> 00:35:42,499
So their answers are always interpolations.
405
00:35:42,499 --> 00:35:45,000
They can't come up with something novel.
406
00:35:45,120 --> 00:35:48,361
And we seem to be, as humans, be able to come up with something novel.
407
00:35:48,361 --> 00:35:50,362
That's at least what it seems like.
408
00:35:50,582 --> 00:35:56,025
But we also have a much bigger brain capacity than the LLMs have.
409
00:35:56,025 --> 00:36:02,091
It's hard to estimate, but it's definitely more than a factor of 1,000, maybe a factor of
10,000.
410
00:36:02,243 --> 00:36:07,064
uh more complex than our brain is and the LLM brain is.
411
00:36:07,384 --> 00:36:17,507
But it seems that sticking to my point is I don't think LLMs can fundamentally do
reasoning indefinitely.
412
00:36:17,507 --> 00:36:22,989
They can pretend to do it with all the data we've seen, but they can't actually think
outside of the box.
413
00:36:22,989 --> 00:36:26,510
Not until this curve fitting problem is solved.
414
00:36:26,510 --> 00:36:27,560
But that can be solved.
415
00:36:27,560 --> 00:36:31,671
There's other algorithms like genetic algorithms.
416
00:36:31,969 --> 00:36:33,780
which do not have that constraint.
417
00:36:33,780 --> 00:36:47,248
So maybe a combination or change of the architectures, bringing in some genetic evolution
into it, genetic algorithms into it, or some other technology might bring us to that next
418
00:36:47,248 --> 00:36:49,849
level that we need.
419
00:36:50,190 --> 00:37:00,175
I definitely think that this, and that's not a very original thing to say, but the worst
LLMs or the worst AIs we've seen are the ones that we see today.
420
00:37:00,461 --> 00:37:01,412
They do evolve.
421
00:37:01,412 --> 00:37:04,334
do think we'll need a scientific incremental step.
422
00:37:04,334 --> 00:37:06,106
It's not just going to be the same technology.
423
00:37:06,106 --> 00:37:09,818
We'll need a new step to really get to AGI.
424
00:37:09,859 --> 00:37:13,002
But that doesn't mean that's not useful with the state it is.
425
00:37:13,002 --> 00:37:24,431
So we've all really realized, I think, that you can give it a document and can summarize
it really well or answer questions on that document really well or use snippets of it and
426
00:37:24,431 --> 00:37:27,534
then compose something new or compose an email.
427
00:37:27,534 --> 00:37:29,101
So there's definitely...
428
00:37:29,101 --> 00:37:34,869
within the constraints it has, there's definitely value we can get out of it.
429
00:37:35,362 --> 00:37:49,028
Yeah, I think to take it to that next level, to start curing diseases and uh really making
breakthroughs in science, the ability to come up with novel concepts, um like you said, it
430
00:37:49,468 --> 00:37:58,732
can reassemble the existing Lego blocks it's been trained on to create new structures of
information.
431
00:37:58,732 --> 00:38:03,064
But in terms of something outside of that universe of
432
00:38:03,064 --> 00:38:05,196
things that seem like a mathematical proof.
433
00:38:05,196 --> 00:38:12,051
Finding a novel approach, I was a math major undergrad, so, and I struggled in advanced
calculus with proofs.
434
00:38:12,051 --> 00:38:14,263
That was a very humbling experience for me.
435
00:38:14,263 --> 00:38:32,118
um I'm great at, you know, um differential equations, matrix theory, all the stuff where
you're solving equations, but proofs require such a abstract lens and thinking that
436
00:38:32,334 --> 00:38:36,483
um And I don't think LLMs are ever gonna get there.
437
00:38:36,483 --> 00:38:40,043
That limitation is embedded in their design, correct?
438
00:38:40,043 --> 00:38:40,943
That's correct.
439
00:38:40,943 --> 00:38:42,564
Yeah, I think that's correct.
440
00:38:43,124 --> 00:38:48,225
By the way, I start to seem like the benchmark guy, but another interesting one.
441
00:38:48,225 --> 00:39:00,368
So there's all these bar exam benchmarks, but they test for a lot of knowledge that the LM
might have seen on the internet.
442
00:39:00,549 --> 00:39:08,251
There's Francois Cholet, uh he's the person who started the em Keras library, I think.
443
00:39:08,251 --> 00:39:09,971
So one of the AI
444
00:39:10,019 --> 00:39:15,431
programming, the libraries you would use if you would be writing a low level AI code in
Python.
445
00:39:15,431 --> 00:39:18,943
He was the original author of that product.
446
00:39:18,943 --> 00:39:26,726
And he's also very verbose about and explicit about the fact that he doesn't believe that
the LLMs will take us there.
447
00:39:26,726 --> 00:39:30,168
And to prove it, he's created this Arc challenge.
448
00:39:30,168 --> 00:39:36,710
And the Arc contest is a benchmark, but with challenges that the LLM definitely hasn't
seen.
449
00:39:36,710 --> 00:39:38,371
So they come up with...
450
00:39:38,403 --> 00:39:44,763
challenges, which are very abstract visual challenges that as a human are super simple to
solve.
451
00:39:44,763 --> 00:39:49,483
Like we'll nail 99 % of them without an issue.
452
00:39:49,483 --> 00:39:57,423
But the LLM score maybe 2, 3 % on the current ARC2 benchmark.
453
00:39:57,463 --> 00:40:06,183
So he thinks that that's a true benchmark for novel thinking, maybe for the path towards
general intelligence.
454
00:40:07,127 --> 00:40:17,422
And that's an interesting one too, and he's definitely an interesting person to listen to and
to interview, if you're ever able to get him on the podcast.
455
00:40:17,612 --> 00:40:20,864
Yeah, no, that sounds super interesting.
456
00:40:20,864 --> 00:40:26,368
Yeah, I love getting into the details with this.
457
00:40:26,368 --> 00:40:27,819
I've had guests on the show.
458
00:40:27,819 --> 00:40:33,963
In fact, I had a former colleague of yours, Jack Shepherd, on the
podcast.
459
00:40:33,963 --> 00:40:37,776
And we were talking about the legal reasoning question.
460
00:40:37,776 --> 00:40:44,020
Or, I'm sorry, not legal reasoning, just LLMs' ability to reason and whether or not they
truly comprehend.
461
00:40:44,141 --> 00:40:46,582
And his comment was,
462
00:40:46,582 --> 00:40:48,144
It doesn't really matter.
463
00:40:48,144 --> 00:40:50,485
And, um, this is about a year ago.
464
00:40:50,485 --> 00:40:56,931
So this is before o3 and these reasoning models came on the scene.
465
00:40:56,931 --> 00:41:10,523
And my rebuttal to that was, well, I think it does matter, because understanding
these limitations helps influence how you apply the technology to a
466
00:41:10,523 --> 00:41:11,984
problem domain.
467
00:41:12,165 --> 00:41:12,465
Right.
468
00:41:12,465 --> 00:41:13,666
If you know,
469
00:41:13,698 --> 00:41:30,082
that it truly can't reason and come up with novel ideas, you're going to be better
equipped to deploy it in a way that's going to lead to success.
470
00:41:30,082 --> 00:41:34,784
I think it is important for us to understand these limitations.
471
00:41:34,784 --> 00:41:39,685
And also, I'm not a lawyer, but I've had many lawyers on this show.
472
00:41:39,685 --> 00:41:43,366
And they all consistently say that, uh
473
00:41:43,424 --> 00:41:47,417
Legal is a past-looking discipline, right?
474
00:41:47,417 --> 00:41:48,048
Everything.
475
00:41:48,048 --> 00:41:51,701
So it doesn't really have to come up with anything new.
476
00:41:51,861 --> 00:41:53,088
Now, there are exceptions.
477
00:41:53,088 --> 00:42:08,415
So when Marty Lipton came up with the poison pill concept in the 1980s as a mechanism
to deter hostile takeovers, that was a new approach.
478
00:42:08,415 --> 00:42:11,874
Could an LLM piece that together?
479
00:42:11,874 --> 00:42:13,275
That's a good question.
480
00:42:13,615 --> 00:42:24,723
I don't know, because he used existing mechanisms to create it, you know, mechanisms that
would probably exist in an LLM's training data.
481
00:42:24,723 --> 00:42:29,986
So could an LLM come up with a new poison pill approach?
482
00:42:30,467 --> 00:42:40,095
Well, it comes up with interesting ideas. But fundamentally, I think it can't come up
with a truly novel idea.
483
00:42:40,356 --> 00:42:45,840
The mathematical proof is the perfect example of where it actually completely falls short.
484
00:42:46,461 --> 00:42:54,728
It kind of depends a little bit on what it means to assemble something together and how
much novelty there is truly in that poison pill.
485
00:42:54,728 --> 00:42:58,211
And I don't really know that well enough to...
487
00:42:58,733 --> 00:43:07,050
I don't know all the prior examples of that to make a good prediction of whether that's
possible.
488
00:43:07,050 --> 00:43:19,300
I guess another thing is, if you come up with a complex problem and you want it to plan out
what it should be doing, so we've got these agent technologies, it's not always great at
489
00:43:19,300 --> 00:43:25,666
making the plan and then following through on the plan and definitely not good at seeing
where its plan goes wrong.
490
00:43:25,666 --> 00:43:27,887
I think that's part of this.
491
00:43:27,917 --> 00:43:31,809
this incapacity to truly, truly grasp what's going on.
492
00:43:31,809 --> 00:43:32,059
Right.
493
00:43:32,059 --> 00:43:39,692
So if it's more than just string manipulation, which is what's going on, you kind of lose a
certain meaning to it.
494
00:43:39,692 --> 00:43:43,333
Having said that, we've been proven wrong over and over.
495
00:43:43,333 --> 00:43:49,196
And we see more and more examples of more complex reasoning being done by LLMs.
496
00:43:49,196 --> 00:43:49,406
Right.
497
00:43:49,406 --> 00:43:53,697
So, and it's interesting, this is all empirical.
498
00:43:54,738 --> 00:43:57,439
Contrary to the software algorithms that you wrote in
499
00:43:57,439 --> 00:44:01,940
BASIC, somebody could just go in and figure out what it was doing there, right?
500
00:44:01,940 --> 00:44:04,202
And see why it can or can't do it.
501
00:44:04,202 --> 00:44:06,783
This is not the case for the LLMs.
502
00:44:06,783 --> 00:44:12,045
We really have to empirically test them as if they're a black box and see, you know.
503
00:44:12,045 --> 00:44:19,529
So even the greatest minds, the biggest experts, if you ask Yann LeCun or Hinton, they
will have different opinions.
504
00:44:19,529 --> 00:44:22,330
And, you know, you would think these guys would probably just see it.
505
00:44:22,330 --> 00:44:25,691
They know the technology in and out, but it's not that simple.
506
00:44:25,730 --> 00:44:26,090
Yeah.
507
00:44:26,090 --> 00:44:29,492
And they all have wildly different assessments.
508
00:44:29,492 --> 00:44:36,315
I'd say Yann is the most bearish, the most skeptical.
509
00:44:36,315 --> 00:44:46,470
I think he likes making press-worthy statements, you know, that AI is not
even as smart as a house cat.
510
00:44:46,470 --> 00:44:49,461
you know, those things create headlines, and that gets him attention.
511
00:44:49,461 --> 00:44:50,561
And I think he likes that.
512
00:44:50,561 --> 00:44:52,364
um But...
513
00:44:52,364 --> 00:44:53,404
I know we're almost out of time.
514
00:44:53,404 --> 00:44:58,116
I have a final question for you though, um which I think is a really important one for our
listeners.
515
00:44:58,116 --> 00:45:03,979
So we cater primarily to knowledge management and innovation professionals in large law
firms.
516
00:45:03,979 --> 00:45:17,214
And I'm wondering where the future lies in knowledge management, you
know, which is the discipline where you kind of curate and, you know, identify and create
517
00:45:17,214 --> 00:45:21,698
and maintain repositories of model or precedent documents.
518
00:45:21,698 --> 00:45:30,557
Those examples kind of reminded me of what you talked about, the rules-based
approach to language translation.
519
00:45:30,557 --> 00:45:37,683
And will we get to a place where the technology can do that?
520
00:45:37,683 --> 00:45:41,239
What are your thoughts on that?
521
00:45:41,239 --> 00:45:49,001
Yeah, I mean, we've touched on that slightly before. But I think we are not there at the
moment.
522
00:45:49,001 --> 00:45:57,364
There's not even a forecast, like an outlook, that that's going to be the case, that
you could just train a model and have that job handled.
523
00:45:57,364 --> 00:46:02,245
So I would say let's now be very realistic and know the current limitations.
524
00:46:02,245 --> 00:46:03,515
Same message, right?
525
00:46:03,515 --> 00:46:05,526
Find the applications that work.
526
00:46:05,886 --> 00:46:08,887
The knowledge industry can definitely benefit from AI.
527
00:46:08,887 --> 00:46:10,261
I mean, it's just...
528
00:46:10,261 --> 00:46:19,358
undoubtedly. There's probably still some discovery going on about what it can do and how
far it can do it reliably, but it can do it right now.
529
00:46:19,638 --> 00:46:25,102
Now, that outlook, that horizon we're moving towards, will it be possible?
530
00:46:25,102 --> 00:46:28,885
My personal hunch is that yes, it will be.
531
00:46:28,885 --> 00:46:36,891
I've seen too many examples of connectionist models seeing the
532
00:46:36,931 --> 00:46:41,852
I guess the forest through the trees and figuring it out at some point at a level of
complexity.
533
00:46:41,852 --> 00:46:44,173
I don't see why that wouldn't be the case.
534
00:46:45,593 --> 00:46:53,035
The hardest thing will be to figure out what the timeline is for that, and the complexity of
the models and the cost associated with running them.
535
00:46:53,035 --> 00:46:55,816
Now, interestingly enough, we have, I think, an upper limit, right?
536
00:46:55,816 --> 00:46:59,007
Our brain is embedded in this physical world.
537
00:46:59,007 --> 00:47:00,797
It is a computer.
538
00:47:00,837 --> 00:47:04,158
It's pretty cheap to run in terms of energy capacity.
539
00:47:04,158 --> 00:47:06,939
So there is definitely...
540
00:47:07,437 --> 00:47:17,099
we should at some point achieve something like that. I mean, the brain is the upper limit, or really
the lower limit, of the cost that we should be able to achieve at
541
00:47:17,099 --> 00:47:18,601
some point.
542
00:47:18,942 --> 00:47:21,805
I'm bullish on that being the case.
543
00:47:21,805 --> 00:47:26,234
I just don't know when, if that's not too vague of an answer.
544
00:47:26,234 --> 00:47:26,794
I get it.
545
00:47:26,794 --> 00:47:35,811
And then, you know, I'm very bullish on the need for knowledge management, at least in the
near to mid term.
546
00:47:35,811 --> 00:47:37,352
It's needed more than ever.
547
00:47:37,352 --> 00:47:43,977
Like, as we transition out of this billable hour model, which we're going to, we're
going to go kicking and screaming.
548
00:47:43,977 --> 00:47:44,757
it's
549
00:47:45,312 --> 00:47:48,255
it will still play a role in how things get priced.
550
00:47:48,255 --> 00:47:56,574
But at the end of the day, I don't think customers are going to pay for time like they
used to given these new technology advancements.
551
00:47:56,574 --> 00:48:02,740
I think that puts knowledge management in a position where they can really drive bottom-line
performance.
552
00:48:02,740 --> 00:48:08,979
And that's going to be really important to the business.
553
00:48:08,979 --> 00:48:18,159
I think we'll see a lot of potential for automation that's driven by access to good knowledge
assets.
554
00:48:18,159 --> 00:48:32,059
So you'll get great automation starting from a knowledge asset, finding some additional
inputs, and getting close to an output product, as long as you have clear sight of
555
00:48:32,059 --> 00:48:34,719
what those good assets are.
556
00:48:34,719 --> 00:48:37,019
I'm with you.
557
00:48:37,207 --> 00:48:38,609
Put the investment there now.
558
00:48:38,609 --> 00:48:44,234
Put the investment into finding the information, enriching it, and the search
technology to find it.
559
00:48:44,295 --> 00:48:50,962
And then I would say experiment with AI to see what automation you can drive on top of
that in the actual legal flow.
560
00:48:51,800 --> 00:48:52,630
Yeah.
561
00:48:52,871 --> 00:48:55,565
Well, this has been a fantastic conversation.
562
00:48:55,565 --> 00:48:57,157
I've really enjoyed it.
563
00:48:57,157 --> 00:49:02,342
And I appreciate you spending a few minutes with us here today.
564
00:49:04,025 --> 00:49:05,046
Yeah.
565
00:49:05,407 --> 00:49:08,009
Are you going to be at ILTACON this year?
566
00:49:08,117 --> 00:49:09,870
I will not be at ILTACON.
567
00:49:09,870 --> 00:49:11,452
I'm on holiday.
568
00:49:11,452 --> 00:49:19,113
I regret that now, but I'll find some opportunity to meet you in real life so we can
continue this conversation.
569
00:49:19,162 --> 00:49:20,384
Absolutely.
570
00:49:20,384 --> 00:49:21,326
OK, great.
571
00:49:21,326 --> 00:49:24,873
Well, thanks again, and we'll catch up soon.
572
00:49:25,575 --> 00:49:27,178
All right, thanks, Jan.