The post 2. Interview w/ Frédéric Dubut, Bing appeared first on TechnicalSEO.com.

Linked Resources:

Frédéric’s Twitter (@CoperniX)
Word embeddings
Word2Vec
Visualization
Andrew Ng’s deep learning course
Bing’s new indexing API
Frédéric’s TechSEO Boost presentation (w/ Christi Olsen)
Frédéric’s article on ML and guidelines
Martin Splitt’s article on JS
Playlist for Martin’s JS series
BM25 (i.e., more advanced TF-IDF)

Topic Timestamps:

[0:15] intros
[2:45] why the relationship with the search community
[4:05] how can webmasters help Bing (remember Bing!)
[5:45] why is the community focused on JavaScript?
[9:00] Frédéric’s TechSEO Boost talk
[12:15] what should SEOs know relating to machine learning?
[14:15] trust, bing, and spam
[15:30] Bing’s approach on dealing with spam
[16:45] importance of high quality results (especially top results)
[17:15] search and SEO community relationship
[20:15] what makes a strong eCommerce site? (trust)
[22:15] bing on accepting feedback
[27:15] internationalization in bing
[28:15] why word vectors
[32:15] related content hubs
[33:15] more word vector stuff
[35:45] Karen Spärck Jones, IDF
[37:45] 3 pieces of wisdom
1. remember that we build sites and products for people
2. take Andrew Ng’s deep learning course
3. sign up for BWT (submit URLs or use their brand new API)
[44:15] closing

Favorite Quotes:

“In the end, we build all of this for people.”
“One of the ways we frame it here (at Bing) is: if you look at all the eCommerce websites on the Internet, one question we ask ourselves is, would we give our credit card numbers to that website?”
“It comes from the fact that our users really trust us to serve the best and most authoritative results.”
“Because when we fail it has real life consequences for these people.”
“We are an industry where we are builders. We build websites, we build products, we build the search engine. We all build these things for people.”

Transcript:

Note 1: Add about ~15 seconds to timestamps to account for intro.

Note 2: If you notice any major errors, please reach out to seointhelab [at] merkleinc.com. We tried our best to stay true to the vocal version.

[00:00:02] Alexis Sanders:
Hello. Hello. And welcome back to the podcast. Today we have Frédéric Dubut
from Bing as well as Max from Merkle. Max, would you like to give an
introduction of yourself first?

[00:00:12] Max Prin: Sure.
Thanks, Alexis. My name is Max. I lead the technical SEO team here at Merkle
and we focus on the most technical aspects of SEO, such as structured data and
crawling and indexing.

[00:00:25] Alexis:
And then Frédéric.

[00:00:26] Frédéric Dubut: I
am Frédéric Dubut. I’m part of the Web ranking and quality team here at Bing,
with a specific focus on anti-spam, anti-malware, and all the bad stuff.

[00:00:37] Alexis:
Awesome! And one of the things I found in my research of you, Frédéric, is that
you speak five different languages. How did that even happen?

[00:00:46] Frédéric:
Well, I don’t speak them very well. Really, I’m only truly proficient in French
and English. They say practice makes perfect, and for languages I think the
lack of practice makes you forget very, very fast. In France (and Max went
through the same system, I believe) you have to start studying a foreign
language when you are ten or eleven or so, and you have to take something like
seven or nine years of language. And you have to take two of them, so I
picked English and Spanish. That’s why there are two. Then I was interested
in Japan in general, so I learned a little Japanese and lived there a little
while. And then I started to work in Zurich, so I had to pick up a little bit
of German. Here are your five languages.

[00:01:29] Alexis:
Wow! So we could do this podcast in, like, totally in French, probably with you
and Max. And I would just listen in… I’m just kidding.

[00:01:37] Frédéric: Exactement.

[00:01:40] Max: Yeah,
because after several years living in the U.S., you forget your French. That’s
very hard for me.

[00:01:52] Alexis: I
imagine that’s so true. Gosh…

[00:01:56] Max: I
actually have a hard time talking about SEO in French. So
everything is in English.

[00:02:04] Alexis: Is
it just the work terminology?

[00:02:05] Max:
Yeah, everything. All the key terms, Everything is in English.

[00:02:09] Alexis: Ah, that’s so interesting! And fascinating. Awesome. Okay, so if we dive into some of the meat of the podcast: I’ve been seeing you (Frédéric) speak on the circuit a lot. And, of course, thank you so much, because it’s so fascinating. One of the things that I’ve been noticing is that Google, and Bing especially, have been integrating more with the search community, which is awesome to see, and one of the things from the SEO perspective that I’m really interested in is – what can we, as SEOs, do to support Bing?

[00:02:40] Frédéric:
Yeah, that’s a good question. In general, the reason why we want to interact
with the community is that they keep us honest, in the sense that we know the
product we want to build and we think we know how it’s working, and then you
talk for ten minutes with SEOs and they tell you, “No, no, no. This
technique you thought you eliminated actually works great.” These kinds of
things. So for us, it’s really enlightening. Any feedback the community
has is definitely the best way to help us make a better product for users.

[00:03:12] Alexis: So
it’s almost like by going to these SEO conferences, or search conferences, you
guys are doing some product research?

[00:03:17] Frédéric:
Yeah, absolutely. And the product or program manager role is very
focused on customers, understanding users. And we’re in a very
interesting position at Bing, where we have two different sets of customers, so
to speak. We have the final users who are actually using the product and
entering the search queries. But the Webmaster community as well and SEOs is
extremely important. Without webmasters, without people who write great content
for the Internet, there would be no point in having a search engine. So for us,
it’s extremely important to interact with both.

[00:03:52] Alexis: Definitely!
That’s awesome! And are there any specific tasks? I know that when you spoke
back at SMX East, you talked about how we could optimize our crawl efficiency
as something that is helpful to Bing and is really useful. Is there anything
else like that that you can think of that at the end of the day makes both our
websites better as well as Bing a better search engine?

[00:04:15] Frédéric:
Yeah, and for a lot of people, it will be just making sure the basics are
working (in terms of crawling and indexing). A lot of the technical SEO is not
very different for Google and Bing. But what a lot of people don’t realize is
that sometimes they just allow Google to crawl everything and Bing gets a disallow.

And if you are an SEO or webmaster, and you complain
that you feel you’re too dependent on Google to get your search traffic, but at
the same time, you’re blocking all the other crawlers, all the other search
engines from indexing your website… Well, you’re never going to get away from
that situation. And those people don’t realize these are two very related
points. So making sure the basics are working for Bing in the same way they are
working for Google is definitely the number one thing for most people.
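
For readers who want to check this on their own site, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the example URL, and the helper name `crawler_access` are made up for illustration; the user-agent tokens `Googlebot` and `bingbot` are the ones the two crawlers identify with. It only shows how a rule set that welcomes one crawler can silently block the other.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_txt, url, user_agents=("Googlebot", "bingbot")):
    """Return {user_agent: allowed?} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


# Hypothetical robots.txt: Googlebot may crawl everything, everyone else is blocked.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

result = crawler_access(SAMPLE_ROBOTS_TXT, "https://www.example.com/products/")
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against your live robots.txt (read the file's text and pass it in) is a quick way to confirm that Bing is not getting a disallow that Google does not.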

[00:05:02] Alexis: I
love that point. Think of Bing, remember Bing.

You obviously can’t see me right now, but I’m
wearing a Bing sweatshirt today, so really reppin’ Bing.

[00:05:12] Frédéric: I don’t.
(Lol) There’s only one person on this podcast wearing Bing swag.

[00:05:18] Max: That’s not true! I have a Merkle branded jacket that has a Bing logo on the shoulder.

[00:05:20] Frédéric: Nice, nice. So there’s only one person
on this podcast who’s not wearing Bing.

[00:05:28] Alexis:
That’s awesome. So you reminded me of a tweet that you had recently where you
asked people in the search community what they’re interested in hearing about,
whether it was JavaScript, machine learning or search history, right? And the
top one was JavaScript. Why do you think that the search community is so
fascinated with JavaScript?

[00:05:48] Frédéric:
Well, I think there’s a legitimate concern from the community that as their
websites are getting more and more complicated, the search engines are not going
to represent them (in their index) in the best way.

There is a lot of misunderstanding around what
the best practices for JavaScript are (the things we should do or the
things we shouldn’t do). Maybe on the search engine side, there was some
miscommunication in terms of “we do support JavaScript” or “we don’t support
JavaScript”. It’s much more nuanced than just saying, “Oh, yeah, of course we
support JavaScript.” Yes, at Bing we can claim we support JavaScript, in the
sense that our crawler is able to download those kinds of resources and render
the pages for most frameworks. But it is also a relatively processing-intensive,
computationally intensive process. So you have fairly little control over what
search engines are going to index on your site from JavaScript compared to
regular HTML. And I understand that can cause some nervousness in the SEO
community. So that’s probably why there are a lot of questions and concerns.

[00:06:54] Alexis:
Yeah, so like a lot of anxiety combined with probably like you said, some
different formats of information. I know that one of the things that we’ve seen
is that for certain sites, JavaScript is totally fine. But then you’ll
have another site that switches over to a very JavaScript-heavy
experience and their traffic will suffer from it. So I think that lack of
consistency in terms of experience is so fascinating, and I think it’s
something that makes people really anxious because they don’t want to have that
type of trend in their performance as well. So thank you for sharing that. So
are you going to actually write a piece on that?

[00:07:26] Frédéric: I
think the piece will be on ML and guidelines, actually. One of the reasons is
Google came out recently with a very good article (I think it was Martin Splitt
who wrote it) about Rendertron and techniques to make it easier for websites to
be indexed when there is JavaScript. I think there is a lot of literature that
has already been written on JavaScript. So even though it was fairly clear, I
think it was forty-five to forty-one percent, that’s close enough that I felt
there was not enough written around ML and guidelines, and that’s probably why
I’m going to write about that.

[00:07:59] Max:
Yeah, I was about to say, that’s the good news about best practices for
JavaScript: what you can do to make sure search engines can understand
your content is to serve a prerendered or HTML snapshot, and it goes for both
Google and Bing. Once again, what you do for one search engine is not different
from what you would do to optimize for another search engine, so it’s an
optimization of time.
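
As a rough illustration of what Max describes, the sketch below decides whether a request should get a prerendered HTML snapshot based on its User-Agent header. This is only one possible approach and rests on assumptions: the token list and the function name `wants_prerendered_html` are hypothetical, and a real setup would also need to generate and cache the snapshots (for example with a headless-browser renderer) and keep the snapshot content identical to what users see.

```python
# Hypothetical list of crawler tokens to match; extend it for other bots you care about.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def wants_prerendered_html(user_agent):
    """Return True when the User-Agent looks like a search engine crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


bing_ua = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
google_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36"

for ua in (bing_ua, google_ua, browser_ua):
    target = "prerendered snapshot" if wants_prerendered_html(ua) else "JavaScript app"
    print(f"{target}: {ua}")
```

The same check covers both crawlers, which is the point of the passage: the work done for one search engine carries over to the other.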

[00:08:22] Alexis:
Optimization of time. I love it. And I love the idea of basically endorsing
content that Google has already done, saying this is fine, it works similarly
for Bing, let’s focus on what we need to with machine learning. Which brings up
the talk that you had at TechSEO Boost, which was so fascinating as well. And I
loved your little quip where you said that Bing was the first to be
powered by a neural net. That’s so exciting and so interesting.

[00:08:45] Frédéric: Yeah, that’s a little-known fact, and that’s why Christi (Olsen) and I insisted we kind of hammer it at the end of every conference now. We say Bing was the first one. Interestingly, these were very rudimentary neural nets at the time. A far cry from deep learning. It wasn’t really deep in any way, because there was only one hidden layer and it was very simple. It shows that it’s something that’s been top of mind at Microsoft Research and at Bing for quite a while. We believe the best way to scale search is to use machine learning, to make the machine learn what the best results to return for a query are. And that’s why we have taken this approach, which may be slightly different from what other major search engines do.

[00:09:30] Alexis: I
love that it’s like the most shallow, deep learning that you have. (lol) I’m
just kidding, of course.

[00:09:36] Frédéric:
That’s right. It’s just quite bits. That was thirteen years ago. So…

[00:09:41] Alexis: It
was deep for thirteen years ago, exactly. And you mentioned, too, that they had
a ton of features, over five hundred features, that were engineered into it.
Which, I mean, is one of the very challenging things that had probably been
custom done.

[00:09:54] Frédéric:
Yeah, and honestly, I don’t know exactly how they did it back in 2005.
But feature engineering is definitely a big challenge, and that’s why a lot
of the discussions around ranking factors sound a bit funny, especially for us
at Bing, because some of the features are derivatives of several
other features. You combine things, and our engineers’ take is that
it really is a machine learning problem, so they create new features that won’t
really make a lot of sense for humans, but actually have great predictive
power for the model. And that’s why this ranking factor thing
always comes across as a bit odd for us.

[00:10:33] Alexis:
Yeah, I loved how powerful your example was of how what a machine sees is so
different from what a person sees. In your example, you had used a stop sign
where basically all they did was cover up a small part of it, and the
machine from that saw something totally different, which was a speed limit
sign. And I think the idea that machines process information differently than
humans process information is so interesting and so fascinating and probably
something that you have to deal with on a daily basis.

[00:11:00] Frédéric:
Yeah, and that starts with some of the worst cases, where we see things
that are different from what you are seeing, like cloaking and this kind of
thing. These are considered the cardinal sins of SEO
and search, because if the machine can’t even access the same thing as users,
we lose all trust in everything you’re doing. But even going back to
JavaScript, that’s exactly what the problem with JavaScript is: not having the
guarantee that the machine is going to see all the goodness you’re showing to
users when you have a JS-heavy page. So training all these features, reading
all this knowledge, can get complicated for sure.

[00:11:44] Alexis: Definitely.
So what do you think is the most important part of machine learning for search
professionals to understand? Because there are so many complicated elements,
like Bing’s LambdaMART, vector space (which I love; I really hope “it’s
the same in the vector space” catches on in the industry), and of course RankNet.
What do you think is really important for people who are maybe less technical
or less well versed in mathematics to understand about what you’re trying to
achieve with machine learning?

[00:12:13] Frédéric: So
that’s where I think the guidelines come into play. If you look at
the process of machine learning, it can get pretty complicated from a
technical point of view. If you’re not technical, that sounds like a
foreign language. So what is really important to remember is that it’s a way to
generalize search algorithms that is trained with how humans would judge
the sites according to the guidelines. So the way we train our machine learning
model: we have a subset of queries and URLs, and we send judges to these
websites, and we ask them to rate them according to the search quality
guidelines, and that makes your training set. We hold out a little bit of this
data as validation and test sets. And then that’s where you train your
machine learning algorithm. You want the algorithm to perform really well on
this small subset of queries and URLs that have been judged by humans. Then
you validate with other metrics that it generalizes pretty well to the 1,000x
more queries and URLs we see. So in the end, asking, “Would my site, according
to the guidelines, get a perfect or excellent or good rating?” is probably a
good way to think about it.
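
To make the training, validation, and test idea concrete, here is a minimal sketch of holding out part of a set of human-judged (query, URL, rating) examples. Everything in it is an assumption for illustration: the data is synthetic, the 80/10/10 proportions and the plain random shuffle are arbitrary choices, and nothing here reflects how Bing actually partitions its judgments.

```python
import random


def split_judgments(judgments, validation_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle judged examples and split them into train, validation, and test lists."""
    shuffled = list(judgments)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible

    n_test = round(len(shuffled) * test_fraction)
    n_validation = round(len(shuffled) * validation_fraction)

    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    train = shuffled[n_test + n_validation:]
    return train, validation, test


# Synthetic stand-ins for human ratings given according to quality guidelines.
ratings = ["perfect", "excellent", "good", "fair", "bad"]
judgments = [
    (f"query {i}", f"https://example.com/page-{i}", ratings[i % len(ratings)])
    for i in range(100)
]

train, validation, test = split_judgments(judgments)
print(len(train), len(validation), len(test))
```

The model is fit on `train`, tuned against `validation`, and checked once on `test`; the held-out sets are what give some evidence that it generalizes beyond the judged examples.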

[00:13:26] Alexis:
Nice. I love the idea of using humans, almost like you said, to train on all of
that data so that you can then iterate on that process to make it more
efficient and better in the end. There’s something very beautiful in that.
Hopefully, one day it’ll be all machines, right?

[00:13:40] Frédéric: I
don’t know if I would trust machines to do 100% of the work. For one, I like
my job. I don’t want a machine to take it. (lol) In the end, we build all of
this for people. So keeping people involved in the process, keeping the machine
honest, looking at whether the results make sense, not just that the metric
looks good: I don’t think it’s going away anytime soon.

[00:14:04] Max:
And Frédéric, you talked about trust towards a website. I remember from
experience that Bing is pretty aggressive with spam or big red
flags about websites. A few years ago, I remember a website that we were
launching, and it was a dot info. And just for that reason, it could not be
indexed right away. Can you tell us more about maybe some big red flags that
Bing has in the system that say, “that website is most likely not a good one”?

[00:14:36] Frédéric:
So in the same way that I don’t think there is any silver bullet in
terms of good ranking factors, when you go outside of the worst offenders like
cloaking, I’m not sure there is anything where we would ban a website
outright. I think what happens at Bing, compared to other search engines, is we
tend to see violations of the Webmaster Guidelines as mostly voluntary.

When I hear Fili Wiese (and I know he doesn’t
represent Google anymore) talk about manual penalties, he
says this is mostly education, and if people fix their issues, (Google will)
remove the penalty and everything is great.

And on our side, we take probably a more
punitive approach, where if you try to cheat the system, you’re going to have a
penalty that is going to last for a while, because we don’t want you to cheat
the system again. And we’ve seen before that if we remove the penalty, the site
tends to just do the same things again. So that’s why, when you say
we’re harsher on spam, I think the idea of spam is fairly similar, but the way
we approach it is a bit different. Maybe more of a punitive way, to make sure
people who live by the rules are actually ranked higher in the results.

[00:15:45] Max:
That makes sense.

[00:15:47] Alexis: It brings back this idea. The
fact that you know your site is a relationship between your experience, your
users, and then also search engines as well. Because there’s almost like this
implicit trust that’s formed and you mention the word trust. Of course, I know
that with Google, this whole idea of expertise, authoritativeness, and
trustworthiness is becoming more and more important or popping up a little bit
more. But I think it’s so interesting and fascinating that you know you’re
using that as a standard, almost as if it’s an actual relationship.

[00:16:16] Frédéric:
Well, yeah, and it comes from the fact that our users really trust us to serve
the best and most authoritative results, especially for queries like, say, the
tax season is picking up, and people want to make sure that they’re not giving
their social security number, and all of their confidential information,
out to scammers. And a lot of people will trust whatever comes in the first
positions on Bing. If they click on the first link, most of them
actually cannot even imagine we would send them to a scam
or anything like that. So it is a huge responsibility for us. That’s why we
take it extremely seriously. Because when we fail it has real life consequences
for these people.

[00:17:00] Alexis: Definitely.
And I know that one of the articles that you mentioned you’re thinking about
writing was on the history of search, and I don’t
understand why it was the least popular. Because I feel like hearing the
history of search from your perspective would be so incredibly fascinating,
because I’ve felt (and I don’t know if you’ve felt this as well) that as
time has gone on, people have gravitated more towards that first result. Whereas
in the past, I mean, when I was younger, I just remember almost being told to
more critically evaluate all of the results that were coming through, and
now it’s like, oh, just click on the first one. Whatever that says is fine,
which probably shifts more responsibility onto you as a search engine and your team.

[00:17:38] Frédéric: Yes,
and so there are two aspects to this. In one sense, it makes it slightly
easier, because if it puts more weight on the number one, number two, number
three results, that also means the weight of responsibility is lower, in
return, for things that are not necessarily the best results, like number nine
or number ten. For some queries, like if you type something like [Facebook log
in], there is an excellent number one result I can think of and not much more
at two, three, four, five that I think would fulfill the user intent.

So to some extent it makes it a bit easier for this
category of very navigational, very explicit intent queries. But on the
other hand, you’re right that it’s definitely changing. If you look at the past
twenty years, search used to be more of an information retrieval problem. So
really the idea of, like, assuming this is a library of all the knowledge in
the world, “How can I find the ten best pieces of information or the ten best
books in the library” to match this query. And slowly, we’ve evolved
towards more task completion, actual transactional intent, and also more
and more money got involved. And so that’s where you get spam and people are
between SEO and that that’s probably the main thing that changed search:
the idea that you get a lot of people who would love to be number one and
are going to do whatever it takes to be number one. It’s not just an
information retrieval problem anymore. It’s becoming a really full-fledged
product where all of these dimensions (relevance, quality, context) fall into place.

[00:19:23] Alexis:
You know, when you have a lot of money on the line, I can imagine there are a
lot of consequences that could happen. And, of course, we’ve heard recently
about so many different data breaches and data leakage issues, so super
fascinating. So thank you for sharing that with us. Okay. Do you have any
questions, Max?

[00:19:39] Max:
Yeah, sure. Since you’re in Seattle, and I hear there’s a big eCommerce
company in Seattle, maybe you can tell us (it might be a little bit outside
of the internal Bing system) what, for you, makes a great
eCommerce experience: features on the website that users expect, maybe,
and that Bing will reward, without giving away ranking
factors. That’s not really my question, but something that you guys are looking
for because users are looking for it.

[00:20:13] Frédéric: Yeah,
when it comes to the big eCommerce company in Seattle, it makes our life a bit
easier, because one of the ways we frame it here is, if you look at all
the eCommerce websites on the Internet, one question we ask ourselves is,
would we give our credit card numbers to that website? And so when it
comes to our neighbor in Seattle, sure, I think anyone in the world is
confident that if they give them their credit card number, it’s going to be
taken care of with the greatest care and they’re not going to get unwanted
charges. On the other hand, there are many websites on the Internet where I
would never ever give even four digits of my credit card. And when you look at
these sites, this is really the question you ask yourself: would I give
them my credit card number? It works in the user’s mind. It’s like, what are
the trust factors on this website? Does it look professional? Does it have an
actual contact address that we can look up somewhere? I know that Google has
the BBB rating in their guidelines, and I don’t think they use it as a
ranking factor or something. But the idea that someone else is vouching for you
is something that you need to take into account if you have an eCommerce
website. From a trust point of view, all of these things are probably the number
one thing: you want to make sure users are willing to do business with you, are
willing to give you their credit card number. And that’s what we’re looking for
in the end, user satisfaction.

[00:21:38] Max: I
love that you said that, because just from a design standpoint, today, with
frameworks and built-in features, and even with Bootstrap and other
CSS and HTML frameworks that you can find out there, it is pretty easy to
make a scam look really good and really professional. So I’m glad to hear that
it’s not just about that: a website can look good and still be a scam. And
hopefully we won’t see it popping up in any search results.

[00:22:06] Alexis:
Definitely. It just sounds like it all comes back to trustworthiness. So I’m
really excited to hear that. Okay. All right. I’m going to go back to one of
your tweets. In your tweet, you mentioned that you review user feedback and
that you set aside a specific time to review that, which is really exciting
because I feel like I’ve really felt a lot of positive energy coming from the
Bing team in terms of almost doing a listening tour and trying to figure out
what’s going on in the space, and how can we then learn and react from that. So
has the time that you’ve spent reviewing user feedback ever resulted in a
new project?

[00:22:39] Frédéric:
Yeah, it’s super important to look at feedback. That’s a
personal belief I have, that as product or program managers, it’s an essential
part of our job. And I don’t know if you can do good product manager work
without listening to your customers and users and partners. I can think of two
examples where it’s been extremely useful.

One was a very visible piece of feedback. If you
remember last spring, I think Yoast posted something about Bing crawling too
much. They have a lot of data, probably from their plugin, and they are
very well informed on these problems. We took the feedback very seriously,
and we’d heard before from other people that Bing tends to crawl too much
compared to Google, and that’s something we definitely started to look at very
closely. And that’s what resulted, last week I believe, in this new indexing API
we announced at SMX West, as well as the integration with the Yoast plugin
on My Yoast, which was announced at their conference last week.

So this is a very concrete case where the feedback
we’ve been listening to, and we’ve been aggregating, compounded with someone
very visible and very vocal who voiced the same feedback, resulted in something
extremely concrete that we announced in the past couple of weeks.

Something that is a bit fuzzier, probably, is around
spam and all the times we are failing our users, so to speak. I
take the feedback extremely seriously. When I hear several different people
tell me, “If I type a query for this domain, like the name of a drug or this
kind of thing, I really see bad results,” this informs where we’re going to
invest our resources. And if I hear that a certain area is getting more and
more spam, or if some very technical people come to me and say, “I notice that
this category of site, putting these keywords in this way or whatever, is
ranking higher than it used to,” this is just all goodness.

So I invite all the listeners if you have any feedback
you want to give to us, you can tweet at me directly on Twitter. Or you can use
the feedback form on Bing on the upper right menu and we take it extremely
seriously.

[00:24:58] Alexis:
That’s awesome. It’s almost like keeping one ear to the ground just to make
sure that everything is going well, like a pulse, which is awesome. So thank
you for doing that.

[00:25:05] Frédéric:
Yeah, and in the end we do it for our users. We have a lot of ways to
scale our understanding of user satisfaction with metrics and numbers. But
there’s nothing like qualitative feedback from actual people. I have a
personal belief that if you talk to ten users and you listen to their actual
verbatim feedback, you learn so much more than just looking at a number, even if
the number covers one billion users.

[00:25:35] Alexis:
It’s just so interesting to hear, though, that that qualitative feedback is so
valuable because I think a lot of times when we think about data, we think
about data and the massive amounts of information that, like even we receive on
the webmaster end. And I mean, I can only imagine how much you guys received on
your end. But we usually think about all quantitative, quantitative,
quantitative. But the value of qualitative data is so interesting and how it
can give you a totally different perspective. So thank you for sharing that
with us.

[00:26:02] Max: I’d
like to go back to the fact that there are not a lot of differences in what
webmasters and technical SEOs can do to optimize for search engines, at least
Bing and Google. There is one that I can think of in terms of those
technical tags that we implement: hreflang tags for
international SEO. We all know that hreflang tags do work for Google,
well, most of the time, but the setup can be extremely complicated and the
implementation is really hard to manage. Bing has not been on board with that
tag. Can you tell us a little bit about how you guys really
handle that? Not duplicate content, but variations, international variations,
and how you detect the targeted audience for websites that
have multiple regional languages to target.

[00:26:55] Frédéric:
So I’m going to give a very disappointing answer. I’m not very familiar with
the hreflang tag treatment at Bing, so I’d rather not give an answer that I
think would be inaccurate. What I can tell you is, if you have a website like,
let’s say, blah dot com in English and blah dot fr in French, and if it is the
same company, we have some ways to detect that this is not duplicate content,
that this is actually two different languages. It’s the same thing if you have
slash en and slash fr on a given website. But in terms of hreflang, I just
don’t know, so sorry about that.

[00:27:43] Max:
Yeah, well, as we know, officially Bing does not support the hreflang tag.
Again, that’s not something that I’m really surprised by, because it’s a very
complex implementation. I’ve even heard people at Google who have been working
on creating those tags say that they’re not extremely satisfied with the way it
turned out, that it turned out to be more complicated than they wanted it to be.

[00:28:04] Frédéric:
Well, what I can tell you is it’s already complicated enough when you have only
one language and two different websites, and you want to do just a simple
redirect from one to the other or a simple canonical. And sometimes when I look
at the presentations from other SEOs at conferences, and they show this
super complicated graph for, like, four websites, all canonicalizing to one
another, with the hreflang in multiple foreign languages, it just
sounds like an extremely hard problem. So I’m not surprised that some people at
Google say it is hard, we don’t get it right all the time.

[00:28:41] Max:
Yeah, Sure.

[00:28:42] Alexis: That
reminded me of the concept of different words. Going back to the question of
vectors: you talked in your TechSEO Boost presentation about this idea that
when you represent words as vectors, it ends up being more efficient. Why
vectors? I’m mostly curious because I’m in a class, and we literally just
learned how to calculate the distance between two vectors. So I loved when you
muttered under your breath, “You could just use the cosine of the angle.” I was
like, you totally can. (lol) I can find you the formula for that. But I was
curious: what is it about vectors? For people who are less technical with math,
a vector is almost like a direction, or a line with an arrow. If you look at
Frédéric’s presentation, you can get that type of visual, or just Google “word
vectors”. But why are word vectors so useful?

[00:29:28] Frédéric: So,
in summary, the key concept here is embeddings. The idea is that you have, I
don’t know, maybe a hundred thousand words, or a million words, in the
language, and you want to find similarities between the words. The way we do
that is we convert these words into a series of numbers. Depending on the exact
implementation, it’s, say, one hundred numbers that represent what the word
means. And you train your model so that words that mean roughly the same thing,
or that are similar, have numbers that are close to each other. That’s a nice
way to essentially compact the knowledge in your dictionary into a simple
representation of one hundred numbers. All these numbers represent different
directions in a multi-dimensional space. If you imagine the real world, with
three dimensions, there are three numbers, like left-right and so on, so to
speak. In this world, it’s one hundred different dimensions. So we try to find
the similarities, and in the end, as you mentioned, we essentially measure the
distance between two different words. If you have something like, let’s say,
apple and orange, these are fairly different objects. The words are completely
different, but these are both fruits, so the concepts are still relatively
similar. So I expect these words to be relatively close in the space. And the
reason why it’s extremely useful for search and SEO in general is that it gets
you away from this idea that you need to seed synonyms, or that you need to
make sure you cover ten different variations of the same concept. The hope here
is that the machine is going to understand that. If you are a fruit
distributor, you don’t need apples pears distributor dot com, orange business
dot com, pear business dot com. We understand you’re a fruit distributor. All
these things make sense to the machine, so that’s why it’s extremely exciting
for us as a development.
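
As a rough illustration of the idea Frédéric describes, the sketch below
compares a few words by the cosine of the angle between their vectors. The
four-number vectors are invented for the example (real embeddings are learned
from text and have around a hundred dimensions or more), and the names
`EMBEDDINGS` and `most_similar` are hypothetical helpers, not part of any
Bing system.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for illustration only.
# The three fruits are given similar numbers; "laptop" is given different ones.
EMBEDDINGS = {
    "apple":  [0.90, 0.80, 0.10, 0.00],
    "orange": [0.80, 0.90, 0.20, 0.10],
    "pear":   [0.85, 0.75, 0.15, 0.05],
    "laptop": [0.10, 0.00, 0.90, 0.80],
}

def most_similar(word, k=2):
    """Return the k other words whose vectors point in the most similar direction."""
    target = EMBEDDINGS[word]
    others = [w for w in EMBEDDINGS if w != word]
    others.sort(key=lambda w: cosine_similarity(target, EMBEDDINGS[w]), reverse=True)
    return others[:k]

if __name__ == "__main__":
    # The fruits rank as the closest neighbours of "apple"; "laptop" does not.
    print(most_similar("apple"))
    print(cosine_similarity(EMBEDDINGS["apple"], EMBEDDINGS["laptop"]))
```

With vectors like these, a page about apples and a query about pears end up
close together even though the two strings share no keywords, which is the
point being made about not needing to cover every variation of a concept.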

[00:31:51] Max: I
love that you say that. I always use the superhero example, telling people
that, yes, if you do want to rank for Superman, then it may be good that your
website also talks about Batman or Spider-Man. And again, as you just mentioned
with the fruits, they’re all different words. But they are related because they
are all superhero names, and it will make the website more relevant for a
particular topic. So you need to expand to the context of what the topic is
actually about.

[00:32:22] Alexis: Yeah,
and a lot of people in the industry, I’ve noticed, over the last two years or
so have been talking about this idea of entity optimization: rather than
focusing on keywords, focusing on being known for something overall,
essentially.

[00:32:36] Frédéric:
Yeah, that’s very interesting. We have been working on entities for quite a
while at Bing, and there was a time, before entities and vectors and this
concept of similarity really caught on, where this was much more handcrafted,
so to speak. You would have these very strict relationships where an entity
links to another. For example, Microsoft is a company, so the type field in
the entity is “company”. And then “CEO” is Satya (Nadella), and that would be a
related person, and the relationship field says what the relation is: it is a
manager. What’s kind of magical with these vectors and entities is that all of
these relationships come completely naturally. You don’t need someone to tell
you exactly what the relationship is between Microsoft and Satya. And what is
extremely interesting when you look at the literature, and this is probably one
of the most fascinating properties of these vectors, is that the distance
between Microsoft and Satya Nadella in the vector space is the same as the
distance between Google and Sundar Pichai.

[00:33:26] Alexis:
Weird…

[00:33:30] Frédéric: And
so, if you just draw essentially a triangle between Microsoft, Google, and
Sundar Pichai, then you can extremely easily find that Satya has the same
relationship with Microsoft that Sundar has with Google. I find that really
fascinating, and it just makes the relationships so much more powerful, because
you can learn them in the wild instead of having to handcraft them over time.
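
The “same distance” property can be sketched with a few toy vectors. The
numbers below are hand-built so that the step from company to CEO is identical
for both companies; in embeddings learned from real text the two offsets are
only approximately equal. `VECTORS`, `offset`, and `complete_analogy` are
hypothetical names used for this illustration.

```python
import math

# Hypothetical 3-number vectors, constructed by hand for this example so that
# the company -> CEO offset is the same for Microsoft and for Google.
VECTORS = {
    "Microsoft":     [1.0, 0.0, 0.0],
    "Satya Nadella": [1.0, 1.0, 0.1],
    "Google":        [0.0, 0.0, 1.0],
    "Sundar Pichai": [0.0, 1.0, 1.1],
    "Seattle":       [0.9, -0.5, 0.2],  # unrelated distractor
}

def offset(start, end):
    """Vector pointing from the word `start` to the word `end`."""
    return [e - s for s, e in zip(VECTORS[start], VECTORS[end])]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by applying the a -> b offset to c."""
    guess = [x + d for x, d in zip(VECTORS[c], offset(a, b))]
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return min(candidates, key=lambda w: distance(VECTORS[w], guess))

if __name__ == "__main__":
    # "Microsoft is to Satya Nadella as Google is to ...?"
    print(complete_analogy("Microsoft", "Satya Nadella", "Google"))
```

Nothing in the data says “is CEO of”. The relationship is only implied by the
geometry, which is the contrast with the handcrafted entity fields described
above.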

[00:33:30]:
That’s so mind blowing. And when I’m visualizing this, I don’t know if anybody
has seen the graph of Word2Vec, but it sounds like exactly what you’re talking
about. It probably stands for “word to vector”, and basically it’s like a
three-dimensional graph of words. So, like you’re talking about, you can almost
see the clusters of information, of things that are similar and related
together, as almost like a group of things that are in one area. Just kind of
cool to think about. But it’s even more mind blowing that, for that
relationship, the distance is exactly the same. That’s crazy. Yeah, mind blown.

[00:34:52] Frédéric: I
think in their example, they used man, woman, king, queen, if I remember
correctly. And yeah, that’s exactly what I had in mind. So I think, if you’re
in technical SEO, read the Word2Vec paper, or in general these foundational
papers on word embeddings.

[00:35:11] Alexis:
That’s so brilliant and so fascinating, too. I really hope that saying, you
know, “it’s pretty much the same in the vector space”, or “no, totally
different in the space”, catches on. I think it’s kind of interesting to think
about. When you said something like that, it implied that it’s all about
relevancy, but I just think it’s another funny way to say that. I think that
SEOs tend to find funny ways to say things. Also, I do want to give you a
shout-out: I thought it was really cool that you mentioned Karen (Spärck) Jones
in your speech. I know that she recently passed away, but it’s really cool to
have women of science mentioned, and especially lauded for their
accomplishments. So, thank you for that.

[00:35:52] Frédéric:
And she is really one of the most important people in the field of information
retrieval, which is like the precursor to search (and SEO). If you look at her
work, a lot of people talk about tf-idf. So she’s the mother of -idf. And this
specific part of the formula is actually one that has survived the test of
time. If you look at more advanced things like BM25, the tf- part has been
changed quite a bit, but the idea of the -idf is almost exactly the same. BM25
is considered state of the art today for information retrieval, in some sense.
So it’s quite incredible that her work really is still extremely relevant to
the field, like forty years after she wrote a paper on the idea.
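
To make the comparison concrete, here is a minimal sketch of the textbook
formulas. It shows the classic idf next to one commonly used BM25 variant of
idf, plus BM25’s reworked tf part. The parameter defaults `k1=1.2` and `b=0.75`
are conventional values assumed for the example, and nothing here is claimed to
be how Bing scores documents.

```python
import math

def idf_classic(num_docs, doc_freq):
    """Classic inverse document frequency: rarer terms get a larger weight."""
    return math.log(num_docs / doc_freq)

def idf_bm25(num_docs, doc_freq):
    """One common BM25 form of idf (smoothed so that it never goes negative)."""
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1)

def bm25_term_score(tf, doc_len, avg_doc_len, num_docs, doc_freq, k1=1.2, b=0.75):
    """BM25 score of a single query term for a single document.

    tf is the raw count of the term in the document. Unlike plain tf-idf,
    the tf part saturates (controlled by k1) and is normalised by document
    length (controlled by b).
    """
    length_norm = 1 - b + b * doc_len / avg_doc_len
    tf_part = tf * (k1 + 1) / (tf + k1 * length_norm)
    return idf_bm25(num_docs, doc_freq) * tf_part

if __name__ == "__main__":
    # A term found in 10 of 1,000 documents versus one found in 500 of them.
    print(idf_classic(1000, 10), idf_classic(1000, 500))
    print(idf_bm25(1000, 10), idf_bm25(1000, 500))
    # Repeating a term ten times does not make the score ten times larger.
    print(bm25_term_score(1, 100, 100, 1000, 10))
    print(bm25_term_score(10, 100, 100, 1000, 10))
```

Both idf functions rank a rare term above a common one in the same way, while
the tf side is where BM25 departs from plain term counting.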

[00:36:39] Alexis:
Isn’t that crazy? I think that’s, like, every scientist’s dream, that their
work outlives them. So amazing.

[00:36:45] Frédéric:
Yeah. A lot of conferences have what they call the “test of time” paper award,
and they look at all the papers that were published ten years before. I think
ten years is the canonical time. And they give the award to whatever paper is
still relevant, or the most relevant, at that time.

[00:37:07] Alexis:
And I mean, obviously something that we want to encourage our scientists to do is
have relevant papers!  

All right, so for the closing question, Frédéric,
basically, I’ve been asking all of the other people that have joined the
podcast. What are their three golden nuggets of advice, which is essentially –
what should you do from an interpersonal level, a site-related level or really,
just a personal development level? Could be anything but just three pieces of
advice that you have for our listeners.

[00:37:35] Frédéric:
That’s a great open question. I would say that number one is to remember that
you build things for people. We are an industry where we are builders. We build
websites, we build products, we build the search engine. We all build these
things for people. So this is the number one thing. My goal, as a Bing product
manager, is to make sure the product is going to be useful to the people who
use it. And the consequence for you as a webmaster or SEO is that it is
important that the content you build is going to be useful to my users, because
I am the intermediary between my user and you. So I want to be able to vouch
for you and say, “Yes, I think this is a great result, and I happily send my
users to you.” So that’s definitely number one. From a more technical point of
view, I’m just going to reuse what I said a few years ago.

[00:38:32] Alexis: I
totally feel for you. You can totally reuse whatever.

[00:38:35] Frédéric:
Definitely start looking at embeddings and similarities and how modern NLP is
done with deep learning. If you have a little bit of time, take the Coursera
courses from Andrew Ng. There’s a machine learning 101, and there’s the deep
learning specialization, which is a set of five different courses. I found it
easy to take deep learning, even without the machine learning knowledge. I just
happened to do the machine learning one before, but this is a great course. You
don’t need a lot of technical math background, and he’s going to give you a lot
of the understanding around deep learning. So that would be my advice. If you
can block a little bit of time over the next few months to take this
specialization, or even if it’s not on Coursera, just learn more about these
things, deep learning and how it’s used in NLP. That is the future. That is
really the future. You get an edge just by learning about these things.

[00:39:37] Alexis:
Yeah, I love that. Andrew is definitely the man, too. So…

[00:39:40] Frédéric:
Absolutely. He has worked with the biggest companies, just not Microsoft. We
need to hire him at some point, just so he can have the Grand Slam of big
companies.

[00:39:52] Alexis: Yeah,
when you look up his resume, he was very high up at Google. Then he worked at
Baidu. So, yeah, you guys totally need to hire him at Bing so he has, like,
everything.

[00:40:02] Frédéric: Exactly.
When your lowest achievement is being a professor at Stanford, that just
speaks like…

[00:40:08] Alexis: That’s so
true. But yes, his class on machine learning on Coursera is also very good. And
I think it’s a little bit better than the one that’s on iTunes University,
because that one’s basically the older class, made specifically for Stanford.
Yeah, great, great point. I’m definitely going to check out that deep learning
course too now.

[00:40:29] Frédéric:
Yeah, the machine learning one is definitely a bit more technical, I think. He
had two versions, one on Coursera, and one on the Stanford website. And I think
the iTunes one is really the one he had at Stanford. The one on the Stanford
website really assumed that you had basically followed all the classes at
Stanford before, and so that you have a lot of knowledge in algebra and a lot
of things like that.

[00:40:58] Alexis:
Yeah, if you don’t know what partial derivatives are, it’s very discouraging.
(lol)

[00:41:01] Frédéric:
Exactly. He forces you to compute them. (lol) Whereas for the deep learning
one, you don’t need that technical background. A lot of times he actually says,
“If you know about these partial derivatives and everything, great, here is
some reading for you. If you don’t know about it, just forget about this.”
Understanding the concept is somewhat more important than being able to compute
a partial derivative.

[00:41:25] Alexis:
Yeah, I love that idea of having to understand the intuition of what you’re
actually trying to achieve in math. I feel like that’s an underappreciated art,
and I thought you did a great job of it in your TechSEO Boost talk as well,
like saying, “Well, here’s the intuition of it, you know?”

[00:41:40] Frédéric:
Well, I guess I tried to channel my Andrew at TechSEO Boost, because he does
that a lot in his videos. When we talk about one-hundred-dimension vector
spaces, it’s going to be extremely hard to visualize or understand what it is.
And so a lot of the time in his videos he is going to explain the intuition
behind it, and why we do this a certain way. That’s why it’s just a great,
great series of courses, and not just a good deep learning class. It is, I
think, the reference at this point to learn more about deep learning.

And I will use my third key point, maybe, to do a little bit of upselling for
Bing, in the sense that we released this crawling and indexing API very
recently. There was an integration with Yoast, but you don’t need to use Yoast.

If you have a website that is running on any platform, you can still go ahead,
register with Bing Webmaster Tools, and start using the API, or even just
submit your URLs directly there. For most websites, if you do that, you should
definitely see great improvements in terms of crawling and indexing. So that
would be really my top recommendation. If you feel your site isn’t being
crawled and indexed properly by Bing, start with Bing Webmaster Tools, submit
some URLs, and that should solve most of the problems.

[00:43:05] Max:
Are you saying that we should not use XML sitemaps anymore? (lol)

[00:43:10] Frédéric: XML
sitemaps are good, but the way we see it, they are just a list of all the URLs
on your website. And if you have a million of them, maybe you care a lot about
ten thousand of them, not the entire one million.

And that’s great, because you can submit ten thousand URLs to the Submit URLs
feature in Webmaster Tools, and these are the ones we’re going to prioritize.
So we can discover all one million from your sitemap. But instead of letting us
decide which ones are more important, we just prefer you telling us which ones
are important.
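
On the webmaster side, “telling us which ones are important” comes down to
picking a priority subset before submitting anything. The sketch below is one
possible way to do that. The scores, the example URLs, and the batch size are
all hypothetical, the ten-thousand limit is simply the figure mentioned above,
and the actual submission step through Bing Webmaster Tools or its API is
deliberately not shown.

```python
def pick_priority_urls(url_scores, limit=10_000):
    """Return up to `limit` URLs, highest score first.

    `url_scores` maps each URL to an importance score chosen by the site
    owner (for example based on freshness or revenue); the scoring itself
    is an assumption of this sketch.
    """
    ranked = sorted(url_scores, key=url_scores.get, reverse=True)
    return ranked[:limit]

def in_batches(urls, batch_size=500):
    """Split a list of URLs into chunks of at most `batch_size` items."""
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]

# Hypothetical scores for a tiny example site.
SCORES = {
    "https://example.com/": 1.0,
    "https://example.com/new-product": 0.9,
    "https://example.com/category/fruit": 0.7,
    "https://example.com/blog/2012/old-post": 0.1,
}

if __name__ == "__main__":
    priority = pick_priority_urls(SCORES, limit=2)
    for batch in in_batches(priority, batch_size=2):
        # Each batch would be handed to whatever submission mechanism you use.
        print(batch)
```

The full sitemap still lists every URL for discovery. The priority list only
covers the pages you most want crawled first.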

[00:43:40] Max:
That is amazing. Thank you, guys, for putting that together and making it
available.

[00:43:44] Alexis:
Yeah. Congratulations! That’s so exciting and exciting for us as well in the
search community, so thanks! Well, I know you have a tight time schedule, but
thank you so much for coming on our show and for educating us all on Bing and
the history. And of course, some of the more technical knowledge as well, very,
very exciting and very honored to have you on the podcast. We’ll definitely
have to check out some of the more technical things, like embeddings as well as
deep learning. So thank you for that as well! It’s been an honor!

[00:44:12] Frédéric:
Thanks for having me!

[00:44:13] Alexis: Alright. Thanks, ciao everyone!

