HOW TO MAKE A CHATBOT
All 7 billion people on Earth would have the capability of learning anything much faster. The web democratized information, and this next evolution will democratize something just as important: guidance. The ideal chatbot can talk intelligently about any domain. That's the holy grail, but domain-specific chatbots are definitely possible. The technical term for this is a question answering system. Surprisingly, we've been able to do this since way back in the '70s.
LUNAR was one of the first. It was, as you might have guessed, rule based, so it allowed geologists to ask questions about moon rocks from the Apollo missions. A later improvement to rule-based Q&A systems allowed programmers to encode patterns into their bot using Artificial Intelligence Markup Language, or AIML. That meant less code for the same results. But yeah, don't use AIML. It's so old it makes Nuka look new. Now with deep learning, we can do this without hard-coded responses and get much better results.
The generic case is that you give it some text as input and then ask it a question. It'll give you the right answer after logically reasoning about it. The input could also be that everybody is happy, and then the question could be: what's the sentiment? The answer would be positive. Other possible questions are: what's the entity? What are the part-of-speech tags? What's the translation to French? We need a common model for all of these questions.
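In other words, the interface we're after looks roughly like this. The `answer` function below is hypothetical, just a stand-in for whatever model we end up building.

```python
def answer(context: str, question: str) -> str:
    """Hypothetical interface: one model, any question about the given text."""
    raise NotImplementedError  # the rest of the post builds toward this

# Example from above:
#   answer("Everybody is happy.", "What's the sentiment?")  ->  "positive"
```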
Facebook took a step in that direction when they released a paper introducing this really cool idea called a memory network. LSTM networks proved to be a useful tool in tasks like text summarization, but their memory, encoded by hidden states and weights, is too small for very, very long sequences of data, be that a book or a movie. A way around this for language translation, for example, was to store multiple LSTM states and use an attention mechanism to choose between them. But they developed another strategy that outperformed LSTMs for Q&A systems. The idea was to allow a neural network to use an external data structure as memory storage. It learns where to retrieve the required memory from the memory bank in a supervised way.
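As a rough illustration of that retrieval step, here's a minimal NumPy sketch of soft attention over an external memory bank. The shapes and names are just assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative shapes: a bank of 5 stored sentence vectors, each 4-dimensional.
memory_bank = np.random.randn(5, 4)   # one row per stored "memory"
query = np.random.randn(4)            # encoding of the question

# Score each memory against the query, then turn the scores into attention weights.
scores = memory_bank @ query          # shape (5,)
weights = softmax(scores)             # values in [0, 1] that sum to 1

# The retrieved memory is a weighted sum over the bank; training teaches the
# network which memories deserve high weight for a given question.
retrieved = weights @ memory_bank     # shape (4,)
print(weights, retrieved)
```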
When it came to answering questions from generated data, that info was pretty easy to come by, but in real-world data it is not that easy. Most recently, there was a months-long Kaggle contest that a startup called MetaMind placed in the top 5% of. To do this, they built a new state-of-the-art model called a dynamic memory network that built on Facebook's initial idea. That's the one we'll focus on, so let's build it programmatically using Keras. This dataset is pretty well organized. It was created by Facebook AI Research for the specific goal of improving textual reasoning, and it's grouped into 20 different tasks. Each task tests a different aspect of reasoning.
So overall, it provides a good overview of all the different capabilities of your learning model. There are 1,000 questions for training and 1,000 for testing per task. Each question is paired with a statement, or series of statements, as well as an answer. The goal is to have one model that can succeed in all tasks easily. We'll use pre-trained GloVe vectors to help create sequences of word vectors from our input sentences, and these vectors will act as inputs to the model.
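For reference, each line in a bAbI task file is numbered, and question lines carry a tab-separated answer plus the indices of their supporting statements. Here's a minimal parser sketch that turns a task file into (story, question, answer) triples; the file name in the usage comment is just an example.

```python
def parse_babi(lines):
    """Parse bAbI-format lines into (story, question, answer) triples.

    Question lines contain a tab-separated answer and supporting-fact ids;
    a line id of 1 marks the start of a new story.
    """
    data, story = [], []
    for line in lines:
        idx, text = line.strip().split(' ', 1)
        if int(idx) == 1:
            story = []                       # a new story begins
        if '\t' in text:                     # question line
            question, answer, _ = text.split('\t')
            data.append((list(story), question, answer))
        else:                                # plain statement line
            story.append(text)
    return data

# Example usage with one of the 20 task files:
# with open('qa1_single-supporting-fact_train.txt') as f:
#     train = parse_babi(f.readlines())
```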
The DMN architecture defines two types of memory: semantic and episodic. These input vectors are considered the semantic memory, whereas episodic memory might contain other knowledge as well; we'll talk about that in a second.
We can fetch our bAbI dataset from the web and split it into training and testing data. GloVe will help convert our words to vectors, so they're ready to be fed into our model.
The first module, the input module, is a GRU, or gated recurrent unit, that runs on a sequence of word vectors. A GRU cell is kind of like an LSTM cell, but it's more computationally efficient since it only has two gates and it doesn't use a memory unit. The two gates control when its content is updated and when it's erased: update and reset. The hidden state of the input module represents the input processed so far as a vector. It outputs hidden states after every sentence, and these outputs are called facts in the paper, because they represent the essence of what has been fed in. Given a word vector and the previous time step's vector, we'll compute the current time step's vector. The update gate is a single-layer neural network: we sum up the matrix multiplications, add a bias term, and then the sigmoid squashes the result to a list of values between 0 and 1, the output vector. We do this twice with different sets of weights, then we use a reset gate that will learn to ignore past time steps when necessary, for example if the next sentence has nothing to do with those that came before it.
The update gate is similar in that it can learn to ignore the current time step entirely; maybe the current sentence has nothing to do with the answer, whereas previous ones did.
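Here's a minimal NumPy sketch of a single GRU step with its update and reset gates, following the standard formulation; the dimensions and weight names are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step: two gates (update, reset) and no separate memory cell."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params

    # Update gate: a single-layer network squashed by a sigmoid to values in [0, 1].
    z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)
    # Reset gate: same form with different weights; learns when to ignore the past.
    r = sigmoid(Wr @ x_t + Ur @ h_prev + br)

    # Candidate hidden state, with the past "reset" wherever r is near 0.
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)

    # Blend old state and candidate: z near 1 keeps the past, z near 0 takes the new input.
    return z * h_prev + (1.0 - z) * h_tilde

def init_params(d_in, d_h):
    rand = lambda *shape: np.random.randn(*shape) * 0.1
    # Three sets of weights: update gate, reset gate, candidate state.
    return [rand(d_h, d_in), rand(d_h, d_h), np.zeros(d_h),
            rand(d_h, d_in), rand(d_h, d_h), np.zeros(d_h),
            rand(d_h, d_in), rand(d_h, d_h), np.zeros(d_h)]

d_in, d_h = 50, 32                       # e.g. 50-d GloVe vectors, 32-d hidden state
params = init_params(d_in, d_h)
h = np.zeros(d_h)
h = gru_step(np.random.randn(d_in), h, params)
```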
Then there's the question module. It processes the question word by word and outputs a vector, using the same GRU as the input module and the same weights. We can encode both of them by creating embedding layers for both.
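As a rough Keras sketch of those two modules sharing one embedding and one GRU, with the vocabulary size, sequence lengths, and dimensions as placeholder assumptions (note that the full DMN emits one fact vector per sentence, while this sketch encodes the whole story into a single vector for brevity):

```python
from tensorflow.keras.layers import Input, Embedding, GRU

vocab_size, embed_dim, hidden_dim = 10000, 50, 32   # placeholder sizes
story_len, question_len = 68, 4                     # placeholder sequence lengths

story_in = Input(shape=(story_len,), name='story')
question_in = Input(shape=(question_len,), name='question')

# One embedding layer and one GRU, reused so the input module and the
# question module share the same weights.
embed = Embedding(vocab_size, embed_dim)
encoder = GRU(hidden_dim)

story_vec = encoder(embed(story_in))        # encodes the statements
question_vec = encoder(embed(question_in))  # encodes the question with the same weights
```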
Then we'll create an episodic memory representation for both. The motivation for this in the paper came from the way the hippocampus functions in our brain: it's able to retrieve temporal states that are triggered by some response, like a sight or a sound. Both the fact and question vectors that are extracted from the input enter the episodic memory module. It's composed of two nested GRUs. The inner GRU generates what are called episodes. It does this by passing over the facts from the input module.
When updating its internal state, it takes into account the output of an attention function on the current fact. The attention function gives a score between 0 and 1 to each fact, and so the GRU ignores facts with low scores. After each full pass over all the facts, the inner GRU outputs an episode, which is then fed to the outer GRU.
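That gating can be sketched roughly as below: the attention score decides whether the inner GRU updates its state on a fact or just carries the previous state forward. The names, shapes, and stand-in components are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def attention_gated_pass(facts, question, memory, gru_step, attention):
    """One pass of the inner GRU over the facts, gated by attention scores.

    facts     : list of fact vectors from the input module
    question  : question vector
    memory    : current episodic memory (the outer GRU's state)
    gru_step  : function (fact, hidden) -> new hidden state
    attention : function (fact, memory, question) -> score in [0, 1]
    """
    h = np.zeros_like(memory)
    for fact in facts:
        g = attention(fact, memory, question)
        # A low score leaves the hidden state untouched, so that fact is ignored.
        h = g * gru_step(fact, h) + (1.0 - g) * h
    return h  # this episode is then fed to the outer GRU

# Hypothetical usage with dummy stand-ins, just to show the shapes involved:
d = 32
facts = [np.random.randn(d) for _ in range(5)]
question, memory = np.random.randn(d), np.random.randn(d)
toy_gru = lambda x, h: np.tanh(x + h)                                  # stand-in for a real GRU
toy_attention = lambda f, m, q: float(1 / (1 + np.exp(-f @ (m + q))))  # stand-in gate
episode = attention_gated_pass(facts, question, memory, toy_gru, toy_attention)
```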
The reason we need multiple episodes is so our model can learn what part of a sentence it should pay attention to after realizing, after one pass, that something else is important. With multiple passes, it can piece together the facts it needs.