Guest | Search engine... engine -
06-04-2007, 01:36 AM
Hi,
For a potential future project I might have to implement a site wide search
engine. I don't have the full details yet, but I'm told the site will most
likely have an agenda, rss feeds and other items that can be brought down to
plain text columns in a database (so no independent files such as PDFs).
Come to think of it, perhaps some sample movie clips, but let's keep those
out of the equation for now.
As I don't have any experience with implementing search engines in a site, I
was wondering... does the core of a textual search engine basically boil
down to something as simple as SQL queries like these:
SELECT some_column FROM foo WHERE some_column LIKE '%bar';
SELECT some_column FROM foo WHERE some_column LIKE '%bar%';
SELECT some_column FROM foo WHERE some_column LIKE 'bar%';
or am I missing some very important essentials here?
If it's of any importance: I develop websites using PHP, MySQL on Apache.
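
To make the three patterns concrete, here is a minimal sketch of what each one matches. It is written in Python against an in-memory SQLite database only so that it runs standalone; the site described here would use PHP and MySQL. The table `foo`, the column `body`, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (body TEXT)")
conn.executemany(
    "INSERT INTO foo (body) VALUES (?)",
    [("bar",), ("foobar",), ("barfoo",), ("foobarbaz",), ("qux",)],
)


def like_search(pattern):
    """Return all bodies matching the LIKE pattern, sorted for stable output."""
    rows = conn.execute(
        "SELECT body FROM foo WHERE body LIKE ? ORDER BY body", (pattern,)
    )
    return [row[0] for row in rows]


ends_with = like_search("%bar")    # value ends in "bar"
contains = like_search("%bar%")    # "bar" appears anywhere in the value
starts_with = like_search("bar%")  # value starts with "bar"

print(ends_with)
print(contains)
print(starts_with)
```

Note that all three patterns match raw substrings, not words, and they return rows without any ranking.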
Thank you in advance for any tips and/or guidance.

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
Why not just use Google's site search? It's much easier to implement!
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp. EMAIL REMOVED
==================

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
"Jerry Stuckle" wrote:
> Why not just use Google's site search? It's much easier to implement!
>
Hi Jerry,
Thanks for the response. I thought about that too, but I don't think I can
get away with that, nor do I want to get away with that. I'll tell you why:
What matters most here is that visual design is key for this site (as for
most of the projects I work on). So I need to have maximum control over the
visual output of results (I'm talking pixelf*cking here, as some of us like
to call this in The Netherlands, but perhaps this term is known abroad too
;-)), as well as control over the information in the search result. So, and
correct me if I'm wrong, I have a feeling that the output of Google results
isn't that easy to tweak on a visual level, nor is it easy to tweak on an
informational level... is it?
Also, but this could very well be my lack of knowledge about integrating
Google search in a site, I have a feeling that you can't get away with
integrating Google search without a notice that it is provided by Google,
right? If I'm right, that will be a big no-no too.
Moreover, some searchable content of the site could very well end up in an
authenticated domain of the site.
And last but not least, I simply like to have some knowledge of how search
engines work when I decide to implement one, so that I have the knowledge,
and thus control, over what *exactly* is indexed. I hope you catch my drift
here.
So the question of my initial message remains. Any more input is much
appreciated. Thanks

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
OK, but it's a lot more complicated than you think. For instance, you
need to search the content - but not the programming (php, perl, etc.)
on the page. You need to be able to look at the generated content from
those programs.
Basically, you need to search the generated pages - not the source code.
Of course, if your site is 100% static, then there is no code to
search, and you won't have a problem. But few sites are that way - and
yours won't be as soon as you start adding search engine code.
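
As a rough illustration of "search the generated pages, not the source code", the sketch below takes HTML as it would be delivered to a browser and keeps only the visible text that a search index should see. It uses Python's standard `html.parser` module. The sample page is hypothetical, and how the generated HTML is obtained (for example by requesting each page over HTTP) is assumed rather than shown.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    """Return the page text with markup removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with spaces keeps block elements apart; it can also insert a
    # space around inline tags, which is acceptable for indexing purposes.
    return " ".join(" ".join(parser.parts).split())


# Hypothetical generated output of a dynamic page.
page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Agenda</h1><p>Next <b>meeting</b> is on June 4</p></body></html>"
)
text = visible_text(page)
print(text)
```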
It's not a simple job, and if you've never done anything with search
engines before, it's no minor thing to write.
As for maximum visual control - every time I hear someone say that over
here, I tell them to just generate PDFs. If they want a web site, then
they should be willing to create dynamic layouts.

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
> OK, but it's a lot more complicated than you think. For instance, you
> need to search the content - but not the programming (php, perl, etc.) on
> the page. You need to be able to look at the generated content from those
> programs.
>
> Basically, you need to search the generated pages - not the source code.
> Of course, if your site is 100% static, then there is no code to search,
> and you won't have a problem. But few sites are that way - and yours
> won't be as soon as you start adding search engine code.
Hmm, I didn't think about this approach yet, because I was focussed on the
idea that the site will predominantly be dynamic, and thus have most of its
content as plain text in VARCHAR or TEXT datatype columns. Even the agenda
and RSS feed content. Perhaps at most with some basic HTML tags here and
there in these fields. So it would not have crossed my mind that I might
have to search through source code. But your perspective made me realize that
I should also take into consideration content that will not be stored in the
DB (if any). Good point.
But let's say, for the sake of argument (am I trying to persuade myself
here? ;-)), that all the site's (important) content will be available in the
DB as plain text with some HTML tags here and there. Would the core of what I
tried to describe hold up? Or is using SQL statements with merely a LIKE
clause a novice's approach? Let me state though that I, on occasion, tend to
have the idea that things are more complicated than they really are. So it
wouldn't surprise me if this indeed *is* a clause even a company like Google
is using in their main SQL queries. But then again, being inexperienced, I
am usually not too sure of these assumptions.
BTW what popped up in my mind in the meantime is that I would probably have
to index database columns too. I imagine that I would also need to come up
with a good approach for tweaking these, right? I'm thinking: the size of
the indices to be most efficient concerning speed, the disk space to store
the indices on, etc.
> It's not a simple job, and if you've never done anything with search
> engines before, it's no minor thing to write.
I was afraid of an answer like that. :-/ On the other hand, the majority of
dynamic site projects I've been involved in turned out to be *not that
simple* in hindsight ;-). But then, that is why I am now turning to this
newsgroup to get some feedback on the matter beforehand, to try and
eliminate too big surprises, and get a well balanced view on the matter. And
if I feel that there will be too many pitfalls for me, with my limited
experience, I will probably reject the assignment.
> As for maximum visual control - every time I hear someone say that over
> here, I tell them to just generate PDF's. If they want a web site, then
> they should be willing to create dynamic layouts.
I usually don't develop sites with 'fluid design', but if the design
companies I work with *did* provide fluid design layouts, they would
still demand pixel-, or rather percentage-perfect results. With margins,
leading, relative font sizes, and the like being in sync with their design.
But that's a whole other discussion.
What it comes down to is that the search engine interface would still have
to be in sync with the overall look and feel of the site.
Thanks again Jerry. Much appreciated.

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
amygdala wrote:
>> OK, but it's a lot more complicated than you think. For instance, you
>> need to search the content - but not the programming (php, perl, etc.) on
>> the page. You need to be able to look at the generated content from those
>> programs.
>>
>> Basically, you need to search the generated pages - not the source code.
>> Of course, if your site is 100% static, then there is no code to search,
>> and you won't have a problem. But few sites are that way - and yours
>> won't be as soon as you start adding search engine code.
>
> Hmm, I didn't think about this approach yet, because I was focussed on the
> idea that the site will predominantly be dynamic, and thus have most of its
> content as plain text in VARCHAR or TEXT datatype columns. Even the agenda
> and RSS feed content. Perhaps at most with some basic HTML tags here and
> there in these fields. So it would not have crossed my mind that I might
> have to search through source code. But your perspective made me realize that
> I should also take into consideration content that will not be stored in the
> DB (if any). Good point.
No, the point here is - do you also store code in the database? Often
times you need it - for instance, one I'm working on right now for a
non-profit uses PHP in the database to list members of chapters. And
you need the generated member list, not the source code, even though the
source code is being stored in the database.
> But let's say, for the sake of argument (am I trying to persuade myself
> here? ;-)), that all the site's (important) content will be available in the
> DB as plain text with some HTML tags here and there. Would the core of what I
> tried to describe hold up? Or is using SQL statements with merely a LIKE
> clause a novice's approach? Let me state though that I, on occasion, tend to
> have the idea that things are more complicated than they really are. So it
> wouldn't surprise me if this indeed *is* a clause even a company like Google
> is using in their main SQL queries. But then again, being inexperienced, I
> am usually not too sure of these assumptions.
>
Define "important content" - especially as in light of the above?
> BTW what popped up in my mind in the meantime is that I would probably have
> to index database columns too. I imagine that I would also need to come up
> with a good approach for tweaking these, right? I'm thinking: the size of
> the indices to be most efficient concerning speed, the disk space to store
> the indices on, etc.
>
And the limit to index size. But if you're searching for LIKE '%foo%',
then indexes won't be used anyway.
>> It's not a simple job, and if you've never done anything with search
>> engines before, it's no minor thing to write.
>
> I was afraid of an answer like that. :-/ On the other hand, the majority of
> dynamic site projects I've been involved in turned out to be *not that
> simple* in hindsight ;-). But then, that is why I am now turning to this
> newsgroup to get some feedback on the matter beforehand, to try and
> eliminate too big surprises, and get a well balanced view on the matter. And
> if I feel that there will be too many pitfalls for me, with my limited
> experience, I will probably reject the assignment.
>
Depends on your definition of "not that simple". The majority of the
dynamic sites I've done are relatively simple. There may be a fair amount
of coding involved, but all of it is straightforward. A site search
engine is anything but simple or straightforward.
>> As for maximum visual control - every time I hear someone say that over
>> here, I tell them to just generate PDF's. If they want a web site, then
>> they should be willing to create dynamic layouts.
>
> I usually don't develop sites with 'fluid design', but if the design
> companies I work with *did* provide fluid design layouts, they would
> still demand pixel-, or rather percentage-perfect results. With margins,
> leading, relative font sizes, and the like being in sync with their design.
> But that's a whole other discussion.
>
All my sites have fluid design. It's one of the things which sets a
true webmaster above a wanna-be, IMHO. Relative font sizes, etc. are
fine. But a design must adjust itself to different default font sizes,
window sizes, etc.
> What it comes down to is that the search engine interface would still have
> to be in sync with the overall look and feel of the site.
>
> Thanks again Jerry. Much appreciated.
>
I understand this. But in general I would explain to them that they can
get a Google search for $5 - about what it costs me to insert the code
to do the Google search and tailor it (5 min. max).
Or they can pay at least $2K-5K (maybe more - depending on the site) for a
custom search engine.
Which option do you think everyone takes?

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
>> Hmm, I didn't think about this approach yet, because I was focussed on
>> the idea that the site will predominantly be dynamic, and thus have most
>> of it's content as plain text in VARCHAR or TEXT datatype columns. Even
>> the agenda and rss feed content. Perhaps at most with some basic html
>> tags here and there in these fields. So it would not have crossed my mind
>> that I might have to search through sourcecode. But your perspective made
>> me realize that I should also take into consideration content that will
>> not be stored in de DB (if any). Good point.
>
> No, the point here is - do you also store code in the database?
Like I tried to point out, apart from the occasional HTML tag, no, no PHP
code.
> Often times you need it - for instance, one I'm working on right now for a
> non-profit uses PHP in the database to list members of chapters. And you
> need the generated member list, not the source code, even though the
> source code is being stored in the database.
Are you implying here that you store complete instances of PHP classes in
the DB? If so... interesting approach. Again, one I wouldn't have come up
with myself very quickly. BTW: serializing the objects by any chance?
> Define "important content" - especially as in light of the above?
Well, for instance, articles, RSS feed content, agenda items. But excluding
contact form information, for instance, or captions of site elements, or an
occasional quote perhaps.
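
One way to picture searching only the "important" content is a single query that covers the chosen tables and leaves the excluded ones out. The following is a minimal sketch under stated assumptions: it uses Python with in-memory SQLite so that it runs standalone (the real site would use PHP and MySQL), and every table name, column name, and row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE agenda_items (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE contact_messages (id INTEGER PRIMARY KEY, message TEXT);
    INSERT INTO articles (title, body)
        VALUES ('Summer festival', 'Line-up announced');
    INSERT INTO agenda_items (title, description)
        VALUES ('Opening night', 'Festival kick-off party');
    INSERT INTO contact_messages (message)
        VALUES ('Question about the festival');
    """
)


def site_search(term):
    """Search articles and agenda items; contact messages are left out on purpose."""
    # Note: '%' and '_' inside term act as wildcards here; a real
    # implementation would escape them before building the pattern.
    pattern = f"%{term}%"
    sql = """
        SELECT 'article' AS kind, id, title FROM articles
            WHERE title LIKE :p OR body LIKE :p
        UNION ALL
        SELECT 'agenda' AS kind, id, title FROM agenda_items
            WHERE title LIKE :p OR description LIKE :p
        ORDER BY kind, id
    """
    return conn.execute(sql, {"p": pattern}).fetchall()


hits = site_search("festival")
print(hits)
```

The `kind` column lets the result template decide how to render and link each hit, which keeps the presentation fully under the site's own control.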
>> BTW what popped up in my mind in the meantime is that I would probably
>> have to index database columns too. I imagine that I would also need to
>> come up with a good approach for tweaking these, right? I'm thinking: the
>> size of the indices to be most efficient concerning speed, the disk space
>> to store the indices on, etc.
>>
>
> And the limit to index size. But if you're searching for LIKE %foo%, then
> indicies won't be used anyway.
Yeah, I think that was what I tried to say when I said: the size of the
indices.
> Depends on your definition of "not that simple". The majority of the
> dynamic sites I've done are relative simple. There may be a fair amount
> of coding involved, but all if it is straightforward. A site search
> engine is anything but simple or straightforward
Well that's probably because in each project I'm involved in I still have to
learn a great deal, since I'm pretty much a novice. But the last project I
did was a little custom CMS which proved to be complex, even though it was a
pretty straightforward CMS. No user management, no fancy permission settings
and stuff. Just very basic. But it included my first time doing image
management and rich text editing, dynamically creating new pages, etc.
Pretty tough the first time around IMHO.
> All my sites have fluid design. It's one of the things which sets a true
> webmaster above a wanna-be, IMHO. Relative font sizes, etc. are fine.
> But a design must adjust itself to different default font sizes, window
> sizes, etc.
I don't know about that Jerry, that's a rather bold statement IMO. I can see
where you're coming from and I think I know why you think fluid design is
important. But to tell you the truth, I don't care too much for fluid
design. And the design companies I've worked for here in The Netherlands
don't either. And I don't blame them, since fonts that resize relative to
statically sized images usually mess up the design of a site big time. It
easily gets disproportional.
I know this argument doesn't hold up, since you of course will say that it
isn't done right then, but I can't remember a fluid design site whose design
didn't totally mess up once you started playing with font size and stuff.
Maybe the importance of fluid design even differs per country's design
culture, I don't know. I've worked for a few major design companies here in
The Netherlands, and the majority of the sites I was assigned to do were
fixed-width sites of about 760 px. Very often center aligned. And they
insisted on fixed font sizes... definitely no tampering with font sizes.
>> What it comes down to is that the search engine interface would still
>> have to be in sync with the overall look and feel of the site.
>>
>> Thanks again Jerry. Much appreciated.
>>
> I understand this. But in general I would explain to them they they can
> get a Google search for $5 - about what it costs me to insert the code to
> do the Google search and tailor it (5 min. max).
>
So, I take it you have experience with integrating Google search? If so, and
if you don't mind elaborating, to what degree is the interface, including
the results, customizable?
Thank you in advance.

Guest | Re: Search engine... engine -
06-04-2007, 01:36 AM
amygdala wrote:
>>> Hmm, I didn't think about this approach yet, because I was focussed on
>>> the idea that the site will predominantly be dynamic, and thus have most
>>> of it's content as plain text in VARCHAR or TEXT datatype columns. Even
>>> the agenda and rss feed content. Perhaps at most with some basic html
>>> tags here and there in these fields. So it would not have crossed my mind
>>> that I might have to search through sourcecode. But your perspective made
>>> me realize that I should also take into consideration content that will
>>> not be stored in de DB (if any). Good point.
>> No, the point here is - do you also store code in the database?
>
> Like I tried to point out, apart from the occasional HTML tag, no, no PHP
> code.
>
And where are you going to put your search engine code, for instance?
Or are you going to have something like a catalog?
I guess my biggest question is - if this is all going to be static
pages, why are you storing the data in a database in the first place?
>> Often times you need it - for instance, one I'm working on right now for a
>> non-profit uses PHP in the database to list members of chapters. And you
>> need the generated member list, not the source code, even though the
>> source code is being stored in the database.
>
> Are you implying here that you store complete instances of PHP classes in
> the DB? If so... interesting approach. Again, one I wouldn't have come up
> with myself very quickly. BTW: serializing the objects by any chance?
>
Not necessarily entire classes. But code specific to those pages often
is stored in the database in many CMSs.
>> Define "important content" - especially as in light of the above?
>
> Well, for instance, articles, rss feed content, agenda items. But excluding
> contactform information, for instance, or captions of site elements, or an
> occasional quote perhaps.
>
Hmm, agenda items probably are going to be in their own tables with code
to access those tables, aren't they? Maybe the same with contact info,
quotes, etc.?
>>> BTW what popped up in my mind in the meantime is that I would probably
>>> have to index database columns too. I imagine that I would also need to
>>> come up with a good approach for tweaking these, right? I'm thinking: the
>>> size of the indices to be most efficient concerning speed, the disk space
>>> to store the indices on, etc.
>>>
>> And the limit to index size. But if you're searching for LIKE %foo%, then
>> indicies won't be used anyway.
>
> Yeah, I think I that was what I tried to say when I said: the size of the
> indices.
>
You can't build an index on an entire text field, anyway. Most databases
limit index sizes, often to 256 bytes. And even then, they can only use the
index when the search term starts the field. For instance, you can
search on "you are" in "You are here", but not in "Here you are".
Indexes are of very limited use in such a case.
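
The prefix restriction can be illustrated without a database at all. In the sketch below a sorted list stands in for an ordered index (an assumption made for illustration, real database indexes are more elaborate): a prefix lookup can jump straight to the matching range with binary search, while a substring lookup has to examine every row. The sample titles are hypothetical.

```python
import bisect

titles = ["You are here", "Here you are", "Young at heart", "Zebra crossing"]

# Sorted (lowercased key, original value) pairs act as a stand-in for an index.
index = sorted((title.lower(), title) for title in titles)
keys = [key for key, _ in index]


def prefix_search(prefix):
    """Find values starting with prefix using two binary searches."""
    prefix = prefix.lower()
    lo = bisect.bisect_left(keys, prefix)
    # "\uffff" sorts after any ordinary character, so this marks the range end.
    hi = bisect.bisect_left(keys, prefix + "\uffff")
    return [original for _, original in index[lo:hi]]


def substring_search(needle):
    """Find values containing needle anywhere; the sort order cannot help here."""
    needle = needle.lower()
    return [title for title in titles if needle in title.lower()]


print(prefix_search("you are"))
print(substring_search("you are"))
```

The second function is the equivalent of a pattern with a leading wildcard: it finds "Here you are" too, but only by scanning the whole table.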
To get around this, search engines often build tables of words and the
pages they are in.
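
A table of words and the pages they appear in is usually called an inverted index. Here is a minimal in-memory sketch of the idea. The page names and texts are hypothetical, the tokenizer is deliberately naive, and a real implementation would store the word-to-page mapping in database tables and add ranking.

```python
import re
from collections import defaultdict

# Hypothetical pages: URL mapped to the visible text of the generated page.
pages = {
    "agenda.php": "Board meeting on Monday",
    "news.php": "Meeting notes and RSS feed updates",
    "about.php": "About the board",
}


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Build the inverted index: each word maps to the set of pages containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in tokenize(text):
        index[word].add(url)


def search(query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


print(search("meeting"))
print(search("board meeting"))
```

Because each lookup is an exact match on a whole word, an ordinary index on the word column can be used, which is what the leading-wildcard LIKE approach cannot offer.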
>> Depends on your definition of "not that simple". The majority of the
>> dynamic sites I've done are relative simple. There may be a fair amount
>> of coding involved, but all if it is straightforward. A site search
>> engine is anything but simple or straightforward
>
> Well that's probably because in each project I'm involved in I still have to
> learn a great deal, since I'm pretty much a novice. But the last project I
> did was a little custom CMS which proved to be complex, even though it was a
> pretty straightforward CMS. No user management, no fancy permission settings
> and stuff. Just very basic. But it included my first time doing image
> management and rich text editing, dynamically creating new pages, etc.
> Pretty tough the first time around IMHO.
>
Not bad.
>> All my sites have fluid design. It's one of the things which sets a true
>> webmaster above a wanna-be, IMHO. Relative font sizes, etc. are fine.
>> But a design must adjust itself to different default font sizes, window
>> sizes, etc.
>
> I don't know about that Jerry, that's a rather bold statement IMO. I can see
> where you're coming from and I think I know why you think fluid design is
> important. But to tell you the truth, I don't care too much for fluid
> design. And the design companies I've worked for here in The Netherlands
> don't either. And I don't blame them, since fonts that resize relative to
> statically sized images usually mess up the design of a site big time. It
> easily gets disproportional.
>
IMHO, that's a big difference between a graphics designer with some
programming experience and a true webmaster. Graphics designers
typically want pages to look exactly the same no matter what. Fine, you
can do it - that's what PDFs are for. But real webmasters understand
the fluid nature of the web, and make it work. Real webmasters can
design pages that can be resized without messing up the design.
> I know this argument doesn't hold up, since you of course will say that it
> isn't done right then, but I can't remember a fluid design site whose design
> didn't totally mess up once you started playing with font size and stuff.
>
You haven't looked around much, then. Of course there are limits, but
they can be quite wide. Check out www.durpal.org, for instance. Or, www.sourceforge.net. Or any of millions of other sites on the net.
> Maybe the importance of fluid design even differs per country's design
> culture, I don't know. I've worked for a few major design companies here in
> The Netherlands, and the majority of the sites I was assigned to do were
> fixed-width sites of about 760 px. Very often center aligned. And they
> insisted on fixed font sizes... definitely no tampering with font sizes.
>
Not that I've found. But as I said, that's the difference between
graphic designers and webmasters. There are a lot of wanna-be
webmasters who think they know what they're doing out there.
>
>>> What it comes down to is that the search engine interface would still
>>> have to be in sync with the overall look and feel of the site.
>>>
>>> Thanks again Jerry. Much appreciated.
>>>
>> I understand this. But in general I would explain to them they they can
>> get a Google search for $5 - about what it costs me to insert the code to
>> do the Google search and tailor it (5 min. max).
>>
>
> So, I take it you have experience with integrating Google search? If so, and
> if you don't mind elaborating, to what degree is the interface, including
> the results, customizable?
>
> Thank you in advance.
>
>
I don't customize it - because my customers don't care if people see
it's Google search. People are used to Google search anyway. And if it
saves them hundreds or thousands of dollars, they quite happily show the
Google results. In fact, they take pride in having it integrated into
their site - even though it's only a few lines of HTML to implement.

Guest | Re: Search engine... engine -
06-04-2007, 01:37 AM
>> Like I tried to point out, apart from the occasional HTML tag, no, no PHP
>> code.
>>
> And where are you going to put your search engine code, for instance? Or
> are you going to have something like a catalog?
>
> I guess my biggest question is - if this is all going to be static pages,
> why are you storing the data in a database in the first place?
What makes you think it is going to be all static pages? I guess you have a
whole different approach of how you build dynamic pages. I build them as
sort of templates. So, these templates are stored on disk in the
filesystem, as regular php pages. And the dynamic part (I'm talking about
the actual 'content' that is inserted in the DB by a CMS user here) is
inserted in these templates through querying the DB. I don't see how this
should involve PHP code being in the DB.
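
To make the approach I mean concrete, here is a minimal sketch. It is in Python only to keep it short; the same shape applies to a PHP page querying MySQL. The `articles` table, its columns, `PAGE_TEMPLATE` and `render_article` are all made-up names for illustration, and SQLite stands in for the real database.

```python
import sqlite3
from string import Template

# Hypothetical template; in the setup described above it would be a file on disk.
PAGE_TEMPLATE = Template("<h1>$title</h1>\n<p>$body</p>")

def render_article(conn, article_id):
    """Fetch one article's plain-text columns and insert them into the template."""
    row = conn.execute(
        "SELECT title, body FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        return None  # no such article
    return PAGE_TEMPLATE.substitute(title=row[0], body=row[1])

# Assumed sample data, standing in for content entered by a CMS user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Agenda', 'Opening night on Friday.')")
print(render_article(conn, 1))
```

The markup and the access code stay in the file; only the text values come out of the DB. The sketch does no HTML escaping, because the stored text may contain the occasional HTML tag.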
>>> Often times you need it - for instance, one I'm working on right now for
>>> a non-profit uses PHP in the database to list members of chapters. And
>>> you need the generated member list, not the source code, even though the
>>> source code is being stored in the database.
>>
>> Are you implying here that you store complete instances of PHP classes in
>> the DB? If so... interesting approach. Again, one I wouldn't have come
>> up with myself very quickly. BTW: serializing the objects by any chance?
>>
>
> Not necessarily entire classes. But code specific to those pages often is
> stored in the database in many CMS's.
I haven't had a close look at other CMS'es too much, but that just isn't the
way I approach these things.
>>> Define "important content" - especially as in light of the above?
>>
>> Well, for instance, articles, rss feed content, agenda items. But
>> excluding contactform information, for instance, or captions of site
>> elements, or an occasional quote perhaps.
>>
>
> Hmm, agenda items probably are going to be in their own tables with code
> to access those tables, aren't they? Maybe the same with contact info,
> quotes, etc.?
No. Like I said, I guess you and I have a whole different view of how a CMS
might be built. The code to access agenda items for instance would simply be
in a php file stored in the filesystem, rather than in the DB. Even markup
code of these agenda items I would keep as much as possible out of the DB.
Only the plain text values (with, like I said an occasional HTML tag), or
integers and dates get stored in the DB.
> You can't build an index on an entire text field, anyway. Most limit
> index sizes, often to 256 bytes. And even then, they can only use the
> index when the search starts a word/phrase. For instance, you can search
> on "you are" in "You are here", but not in "Here you are". Indexes are of
> very limited use in such a case.
> To get around this, search engines often build tables of words and the
> pages they are in.
Ah yes, of course!! Thank you, that is the kind of input I really needed. I
knew there had to be pitfalls I was overlooking, or didn't have enough
knowledge of. And this is exactly such a thing.
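
For the record, a minimal sketch of how I understand the "table of words and the pages they are in" idea. It is in Python with an in-memory SQLite database just to keep it short and runnable; the same table layout should carry over to MySQL. The table name `word_index`, the functions `tokenize`, `build_index` and `search`, and the sample pages are all my own assumptions, not anything prescribed above.

```python
import re
import sqlite3

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(conn, pages):
    """Create a word -> page lookup table from a {page_id: text} dict."""
    conn.execute(
        "CREATE TABLE word_index (word TEXT, page_id INTEGER, "
        "PRIMARY KEY (word, page_id))"
    )
    for page_id, text in pages.items():
        # One row per distinct word per page.
        rows = [(word, page_id) for word in set(tokenize(text))]
        conn.executemany("INSERT INTO word_index VALUES (?, ?)", rows)
    conn.commit()

def search(conn, query):
    """Return the sorted ids of pages that contain every word in the query."""
    words = set(tokenize(query))
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    sql = (
        f"SELECT page_id FROM word_index WHERE word IN ({placeholders}) "
        "GROUP BY page_id HAVING COUNT(*) = ? ORDER BY page_id"
    )
    return [row[0] for row in conn.execute(sql, [*words, len(words)])]

conn = sqlite3.connect(":memory:")
build_index(conn, {1: "You are here", 2: "Here you are", 3: "Agenda for June"})
print(search(conn, "you are"))  # matches pages 1 and 2, wherever the words sit
```

Each lookup is an exact match on `word`, so the database can use the key on the word table instead of scanning every text column with LIKE '%bar%'. Note that this sketch only matches sets of words; phrase search, ranking and stemming would need more than this.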
>> I know this argument doesn't hold up, since you of course will say that
>> it isn't done right then, but I can't remember a fluid design site whose
>> design didn't totally mess up once you started playing with font size and
>> stuff.
>>
>
> You haven't looked around much, then. Of course there are limits, but
> they can be quite wide. Check out www.durpal.org, for instance. Or,
> www.sourceforge.net. Or any of millions of other sites on the net.
With all due respect Jerry, but then you and I are on a different page here
about graphic design. But enough of the fluid design issue, as far as I'm
concerned. It's not an issue for me, nor is it for the companies I build
sites for.
>> So, I take it you have experience with integrating google search? If so,
>> and if you don't mind elaborating, to what degree is the interface,
>> including the results customizable?
>>
>> Thank you in advance.
> I don't customize it - because my customers don't care if people see it's
> Google search. People are used to Google search anyway. And if it saves
> them hundreds or thousands of dollars, they quite happily show the Google
> results. In fact, they take pride in having it integrated into their
> site - even though it's only a few lines of HTML to implement.
Hmm, well I am about 99% sure that that will be a BIG no no in this case.
The project I'm talking about here is going to be for a cultural arthouse
that have their corporate identity high on their priority list, designed by
a design company that take pride in their visual design. With all due respect
here towards Google, but I don't think they want something as 'common' as a
Google search in their site, messing up said organisation's corporate
identity.
But then again, I might be in for a big surprise and find that they don't
really mind. I haven't suggested it yet.
Thanks again Jerry.

Guest | Re: Search engine... engine -
06-04-2007, 01:37 AM
amygdala wrote:
>>> Like I tried to point out, apart from the occasional HTML tag, no, no PHP
>>> code.
>>>
>> And where are you going to put your search engine code, for instance? Or
>> are you going to have something like a catalog?
>>
>> I guess my biggest question is - if this is all going to be static pages,
>> why are you storing the data in a database in the first place?
>
> What makes you think it is going to be all static pages? I guess you have a
> whole different approach of how you build dynamic pages. I build them as
> sort of templates. So, these templates are stored on disk in the
> filesystem, as regular php pages. And the dynamic part (I'm talking about
> the actual 'content' that is inserted in the DB by a CMS user here) is
> inserted in these templates through querying the DB. I don't see how this
> should involve PHP code being in the DB.
>
What you're describing is basically static pages. Just because they
reside on a database and are loaded via a templating system doesn't make
them any less static from the end-user's viewpoint.
Non-static pages would include things like shopping carts - which
respond to the user's input with different output.
>>>> Often times you need it - for instance, one I'm working on right now for
>>>> a non-profit uses PHP in the database to list members of chapters. And
>>>> you need the generated member list, not the source code, even though the
>>>> source code is being stored in the database.
>>> Are you implying here that you store complete instances of PHP classes in
>>> the DB? If so... interesting approach. Again, one I wouldn't have come
>>> up with myself very quickly. BTW: serializing the objects by any chance?
>>>
>> Not necessarily entire classes. But code specific to those pages often is
>> stored in the database in many CMS's.
>
> I haven't had a close look at other CMS'es too much, but that just isn't the
> way I approach these things.
>
Then you basically have static pages, right?
>>>> Define "important content" - especially as in light of the above?
>>> Well, for instance, articles, rss feed content, agenda items. But
>>> excluding contactform information, for instance, or captions of site
>>> elements, or an occasional quote perhaps.
>>>
>> Hmm, agenda items probably are going to be in their own tables with code
>> to access those tables, aren't they? Maybe the same with contact info,
>> quotes, etc.?
>
> No. Like I said, I guess you and I have a whole different view of how a CMS
> might be built. The code to access agenda items for instance would simply be
> in a php file stored in the filesystem, rather than in the DB. Even markup
> code of these agenda items I would keep as much as possible out of the DB.
> Only the plain text values (with, like I said an occasional HTML tag), or
> integers and dates get stored in the DB.
>
That's fine - but you need to get it into the page, right? And if the
CMS's don't have the code in the page itself, they at least have
placeholders in the code to indicate which code is to be loaded and
executed.
>> You can't build an index on an entire text field, anyway. Most limit
>> index sizes, often to 256 bytes. And even then, they can only use the
>> index when the search starts a word/phrase. For instance, you can search
>> on "you are" in "You are here", but not in "Here you are". Indexes are of
>> very limited use in such a case.
>> To get around this, search engines often build tables of words and the
>> pages they are in.
>
> Ah yes, of course!! Thank you, that is the kind of input I really needed. I
> knew there had to be pitfalls I was overlooking, or didn't have enough
> knowledge of. And this is exactly such a thing.
>
>>> I know this argument doesn't hold up, since you of course will say that
>>> it isn't done right then, but I can't remember a fluid design site whose
>>> design didn't totally mess up once you started playing with font size and
>>> stuff.
>>>
>> You haven't looked around much, then. Of course there are limits, but
>> they can be quite wide. Check out www.durpal.org, for instance. Or,
>> www.sourceforge.net. Or any of millions of other sites on the net.
>
> With all due respect Jerry, but then you and I are on a different page here
> about graphic design. But enough of the fluid design issue, as far as I'm
> concerned. It's not an issue for me, nor is it for the companies I build
> sites for.
>
>>> So, I take it you have experience with integrating google search? If so,
>>> and if you don't mind elaborating, to what degree is the interface,
>>> including the results customizable?
>>>
>>> Thank you in advance.
>> I don't customize it - because my customers don't care if people see it's
>> Google search. People are used to Google search anyway. And if it saves
>> them hundreds or thousands of dollars, they quite happily show the Google
>> results. In fact, they take pride in having it integrated into their
>> site - even though it's only a few lines of HTML to implement.
>
> Hmm, well I am about 99% sure that that will be a BIG no no in this case.
> The project I'm talking about here is going to be for a cultural arthouse
> that have their corporate identity high on their priority list, designed by
> a design company that take pride in their visual design. With all due respect
> here towards Google, but I don't think they want something as 'common' as a
> Google search in their site, messing up said organisation's corporate
> identity.
>
> But then again, I might be in for a big surprise and find that they don't
> really mind. I haven't suggested it yet.
>
> Thanks again Jerry.
>
>
If it's that important to them I suggest they get someone capable of
handling the job. Nothing against you, but a site search engine is
anything but a beginner's project.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp. EMAIL REMOVED
==================