Crawling Causes Issues...For Profile.php

As I have posted in the past, the profile.php page has been an issue for my site whenever there has been traffic to the site.  With CloudFlare's clever tracking it is easy to see that the constant crawling by Google of all the profile pages causes some kind of traffic loop that pegged the memory, which then took over space on the server, which then caused the site to come down.  Deano was clever to point this out as a possible issue quite a while back.  I really had no idea it could have caused such an issue. Well done, Deano.

I remember this being solved by someone once before, and maybe with the update those settings were overwritten. My host says that the profile pages do not close or end correctly, which again caused the memory to peg and from there led to a site crash.

It is a dream to have Google crawling over 500,000 pages in less than a month, but this circle of death with profile.php is something that must be resolved, both for future traffic and to allow the crawlers to access as many pages as possible.  Please advise on the solution for this issue.

Csampson
Quote · 14 Jun 2012

Anyone?

Csampson
Quote · 15 Jun 2012

Using Google's Webmaster Tools you can change how your site is crawled and set up limits on how much and how fast they index your site.

Dolphin's caching options should help with large scans of the site's profiles. Is at least the file cache enabled? I wonder if this has anything to do with Google's new evaluation of JavaScript causing 'phantom' ajax traffic for site bubble notifications or simple messenger.

CloudFlare is cool, but can cause issues if you don't redirect the ajax type requests to a 'direct' subdomain.

Also, CloudFlare's T&C/AUP forbid using the CDN for videos, so (if allowing video uploads) you should update the video player file var to pull through a direct subdomain or another CDN.

 

Light man a fire keep him warm for a night, light him ON fire & he will be warm the rest of his life
Quote · 15 Jun 2012

Is this not the same as very heavy traffic, and why would it cause the site to crash?  The problem is that profile.php is not closing properly. I really need some resolution for this; for now I have deactivated the profile.php page so the site can stay up. Please advise.

See Below:

 

Using Google's Webmaster Tools you can change how your site is crawled and set up limits on how much and how fast they index your site.

 

Don't want to do that right now...

 

Dolphin's caching options should help with large scans of the site's profiles. Is at least the file cache enabled? I wonder if this has anything to do with Google's new evaluation of JavaScript causing 'phantom' ajax traffic for site bubble notifications or simple messenger.

I have eAccelerator set up for caching.

CloudFlare is cool, but can cause issues if you don't redirect the ajax type requests to a 'direct' subdomain.

Interesting and I will look into that in the AM

Also, CloudFlare's T&C/AUP forbid using the CDN for videos, so (if allowing video uploads) you should update the video player file var to pull through a direct subdomain or another CDN.

 

I think I got an error for that exact thing and I will track that down as well thanks for the heads up...

 

 

Csampson
Quote · 15 Jun 2012

I created an account for the googlebot and let it have free rein as it crawls my site.


I set the googlebot account to private, so no one would friend it.

Now when google bot crawls, it's as if it's a normal user.  :)

There are some settings in Webmaster Tools where you give it a URL to log in.

You can log in using member.php

 

The setup:

loginUrl is http://yourdomainhere/member.php

login method is POST

parameter 1 is ID (all caps!)

parameter 2 is Password (capital P)
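
For anyone who wants to see what that login looks like as an actual request, here is a minimal Python sketch. It only builds the POST, it does not send it. The field names ID and Password and the member.php URL come from the setup above; the account name, the password and the helper name build_login_request are made up for illustration, and I am assuming member.php needs no other form fields, which may not hold on every Dolphin install.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(base_url, member_id, password):
    """Build (but do not send) a form POST to member.php.

    Field names "ID" and "Password" are case-sensitive, as noted in the post.
    """
    body = urlencode({"ID": member_id, "Password": password}).encode("ascii")
    return Request(
        base_url.rstrip("/") + "/member.php",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values only; swap in your own domain and crawler account.
req = build_login_request("http://yourdomainhere", "googlebot", "secret")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

To actually submit it you would pass req to urllib.request.urlopen, but only against a site you control.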

 

view this page if it's confusing, as I have not had enough coffee this morning.  lol

http://www.mytikibar.com
Quote · 15 Jun 2012

 


Please help me understand this; you're saying Google is "signing" into the site?

How do you know this? Can you actually see and confirm this?

IMO this makes no sense; Google crawls my sites daily without ANY effect on performance.

How can I tell when Google is there? My webcams tell me how many visitors are viewing; normal is 2-10 viewers for the Florida cams. When Google crawls, the views can spike to 200 or more. Going to my webcam server and doing a reverse IP check, it is always Google crawling. When that happens and I go to SSH and run the "top" command, the system load does not change much, let alone crash the site.
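
The reverse IP check mentioned above can be scripted. This is a rough Python sketch, not anything from Dolphin or CloudFlare: it does a reverse lookup on the visitor IP, checks that the hostname ends in googlebot.com or google.com, then does a forward lookup to confirm the hostname maps back to the same IP. The function names are my own, and it assumes IPv4 addresses since gethostbyname_ex does not handle IPv6.

```python
import socket

# Hostname suffixes treated as Google crawlers in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def looks_like_google_host(hostname):
    """True if the hostname sits under one of the Google crawler domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def is_google_crawler(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-confirm it."""
    try:
        hostname = reverse(ip)[0]
        if not looks_like_google_host(hostname):
            return False
        # The forward lookup must list the original IP, otherwise the
        # reverse record could have been faked.
        return ip in forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
```

Call is_google_crawler with an IP taken from your access log. The reverse and forward arguments are only there so the lookups can be swapped out when testing without a network.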

ManOfTeal.COM a Proud UNA site, six years running strong!
Quote · 15 Jun 2012

My issue is with the profile page not closing or dying. Who can help me with this very important code, if indeed the profile page has an issue?

I am up on my math but there must be easily a million page iterations with what is possible to be crawled...

Csampson
Quote · 15 Jun 2012

This is one of those things I would have to see / poke in real time to come up with an idea as to why profile.php seems to be hanging. My guess is if we compare your profile.php to stock we will find a change or 50.

 

Light man a fire keep him warm for a night, light him ON fire & he will be warm the rest of his life
Quote · 16 Jun 2012

Well, I ran a code compare against the new 7.0.9 profile.php and they are identical...
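
For anyone repeating that comparison without a diff tool, here is a small Python sketch using the standard difflib module. The file paths and function names are placeholders of mine. An empty result means the two files match line for line, which only rules out that one file, not anything it includes.

```python
import difflib
from pathlib import Path


def diff_text(stock_text, site_text,
              stock_name="stock/profile.php", site_name="site/profile.php"):
    """Return unified-diff lines between two versions; empty list if identical."""
    return list(difflib.unified_diff(
        stock_text.splitlines(), site_text.splitlines(),
        fromfile=stock_name, tofile=site_name, lineterm=""))


def diff_files(stock_path, site_path):
    """Same as diff_text, but reads both versions from disk."""
    def read(path):
        return Path(path).read_text(encoding="utf-8", errors="replace")
    return diff_text(read(stock_path), read(site_path),
                     str(stock_path), str(site_path))


# Hypothetical usage, with paths adjusted to your own install:
# for line in diff_files("dolphin-7.0.9/profile.php", "public_html/profile.php"):
#     print(line)
```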

 

Csampson
Quote · 16 Jun 2012

 

 


 Google signs into MY site because I gave it an account. 

MOST of my site is inaccessible unless you have an account, so Google was complaining.

I made google's bot a private account and had to chase down ALL the code that ignores profile privacy to make sure when it is logged in, no one can see it.

That took a couple weeks.  LOL

Normally the Google bot just scrapes your web pages, but if you don't allow access, it gets the ole "access denied" message, which I changed into a redirect to the login page.  Thus it was getting nothing useful off my site.

http://www.mytikibar.com
Quote · 16 Jun 2012

 


 

Could I please get an exact run-down of the steps to set up the Googlebot account and login? I'm a bit illiterate in this field.

Thank you.

Quote · 18 Oct 2012
 
 