Node cache implications with a very large sites

 
E R Doll
Posted: Thu Oct 20, 2005 8:51 pm    Post subject: Node cache implications with a very large sites

I am planning a large intranet CMS implementation and want to understand the
implications surrounding the node cache.

From other postings and documentation I understand:

1. Nodes are postings, templates, channels, galleries, gallery items.
2. The size in bytes of the node varies by the node type and the amount of
content in a node.
3. Resource nodes are only metadata, not content. Placeholder nodes have
content.
4. The physical limitation to be managed is the asp.net worker process
memory, which should not exceed 700-800 MB. The node cache makes up a
portion of this memory.
5. Effectively, that physical limit means approximately 35,000 nodes maximum
– give or take, depending on your average node size.
6. There are SQL queries to give a count of nodes in the system.
7. Some postings have references to sites with 100,000 nodes, so I assume
that MCMS can be used for sites with more than the approximately 35,000 node
cache limit.
8. When the node cache reaches the limit set in the SCA, nodes are dropped
to make room for newly requested nodes.
9. For performance and node cache efficiency, the objects in a container
(channel or gallery) should be limited to 300 items. I understand this
limit to be a performance breakpoint rather than a cause of run-time errors.
10. Also, the root channel should have at most 15 sub-channels, as the ISAPI
filter enumerates/hydrates the root channel’s channel collection for every
page load. The number of objects in galleries and other channels is a less
critical consideration.
11. Keep the gallery / channel hierarchy less than 10 deep. I understand
this to be driven by how the node cache works; it affects performance when
retrieving information from the node cache and also search retrievals.
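
The container guidelines in points 9 to 11 can be checked mechanically against an export of the channel or gallery tree. What follows is a minimal sketch, not MCMS API code: the nested-dictionary tree format, the `audit_tree` name and the three constants are assumptions made for illustration, using the 300-item, 15-sub-channel and 10-level figures quoted above. Counting sub-containers towards the 300 objects is also an assumption.

```python
# Hypothetical sketch: the tree is a plain nested dict, not an MCMS object model.
# Each container is {"name": str, "items": int, "children": [containers]}.

MAX_ITEMS_PER_CONTAINER = 300   # guideline from point 9
MAX_ROOT_SUBCHANNELS = 15       # guideline from point 10
MAX_DEPTH = 10                  # guideline from point 11 ("less than 10 deep")

def audit_tree(root):
    """Return a list of human-readable warnings for a container tree."""
    warnings = []
    if len(root["children"]) > MAX_ROOT_SUBCHANNELS:
        warnings.append(f"root has {len(root['children'])} sub-channels")

    # Iterative depth-first walk; depth 1 is the root container itself.
    stack = [(root, 1, root["name"])]
    while stack:
        node, depth, path = stack.pop()
        # Assumption: sub-containers count towards the object limit too.
        size = node["items"] + len(node["children"])
        if size > MAX_ITEMS_PER_CONTAINER:
            warnings.append(f"{path} holds {size} objects")
        if depth >= MAX_DEPTH:
            warnings.append(f"{path} is at depth {depth}")
        for child in node["children"]:
            stack.append((child, depth + 1, f"{path}/{child['name']}"))
    return warnings
```

Calling `audit_tree` on a tree whose `news` channel holds 350 postings would report that one channel and nothing else, which matches the reading that the limits are performance guidelines to watch rather than hard errors.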

My questions:

1. Are the points I gave above accurate?
2. I’m anticipating approximately 20,000 postings, to be migrated from the
existing intranet. Also there may be as many as 15,000 documents for
resource galleries. Some 1500 channels and 1500 resource galleries. So it’s
a reasonable guess that the system will have some 100,000 nodes, assuming an
average of three placeholders per template. So what problems should I expect
to see with the node cache set to a number around 35,000 which is much lower
than the number of potential nodes?
3. How is the node cache built? Does it build itself up based on the
posting and resources visited?
4. Are some nodes added to the node cache by default?
5. How does the node cache determine which nodes are dropped when it becomes
full?
6. Of course, not all pages and documents will be visited every day, some
only once per month. There will be a working-set of pages that are commonly
used. Is it reasonable to assume that good performance is possible if the
node cache holds the nodes for this working set?
7. Does ASP.NET page and fragment caching reduce the size needs for the node
cache?
8. Is there a scenario of node cache “thrashing” where the working set far
exceeds the node cache size and a great deal of processor time is needed to
add and delete nodes, and to adjust the cache data structure?
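
The estimate in question 2 can be reproduced with simple arithmetic. This is only a sketch of how the figure of roughly 100,000 nodes may have been reached: the `estimate_nodes` function is hypothetical, and treating "three placeholders per template" as three placeholder nodes per posting is an assumption, not a documented MCMS rule.

```python
# Hypothetical sketch of the estimate in question 2. Counting one node per
# posting, placeholder, resource and container is an assumption.

def estimate_nodes(postings, placeholders_per_posting, resources, channels, galleries):
    """Rough node count under the one-node-per-object assumption."""
    placeholder_nodes = postings * placeholders_per_posting
    return postings + placeholder_nodes + resources + channels + galleries

total = estimate_nodes(
    postings=20_000,
    placeholders_per_posting=3,
    resources=15_000,
    channels=1_500,
    galleries=1_500,
)
cache_limit = 35_000
print(total)                          # 98000
print(round(cache_limit / total, 2))  # 0.36, about a third of all nodes fit
```

Under these assumptions the cache would hold a little over a third of the repository at any one time, which is why the questions below focus on the working set rather than the total.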

Thank you for any enlightenment on these questions.

E R Doll

Stefan [MSFT]
Posted: Mon Oct 24, 2005 12:51 pm    Post subject: Re: Node cache implications with a very large sites

Hi,

1) The points are mostly accurate, except:
- The limit of 15 items for the root channel. This can be ignored if you do
not care about the performance of non-MCMS content on the box. So if the
machine is purely an MCMS server and does not host any non-MCMS websites,
you can ignore it.
- In addition, rights groups are also nodes, as is the rights group container.
And there i
- Every posting is built from a page object and a posting object, so you need
to double the number of nodes for postings.

2) Usually you will not see any problems. We have customers hosting several
million nodes in a database with good performance.
This is because not all nodes in the repository are hit with the same
frequency. A cache of 35,000 nodes is usually sufficient for good performance,
as the number of your most frequently hit items usually does not exceed 17,500.

3) yes.

4) no.

5) A random item is chosen.

6) yes.

7) Yes. For most sites the performance win from output caching is higher
than from node caching, as items served from the output or fragment cache do
not hit MCMS at all.

8) This can only happen if hits to your site are spread equally over all
items. In addition, with output caching in place this will usually not
happen.
If it does happen and affects site performance, you need to scale out the
application, that is, split it across different SQL servers and different
front-end machines.
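
Answers 2, 5 and 8 can be illustrated with a toy simulation: a fixed-size cache that evicts a random entry when full, fed once with skewed traffic and once with evenly spread traffic. This is a sketch under stated assumptions and does not model MCMS internals. The `RandomEvictionCache` class, the Zipf-like popularity curve and the scaled-down sizes (10,000 items with room for 3,500, roughly the same ratio as 35,000 of 100,000) are all invented for illustration.

```python
import random

class RandomEvictionCache:
    """Fixed-capacity cache that evicts a randomly chosen entry when full."""

    def __init__(self, capacity, rng):
        self.capacity = capacity
        self.rng = rng
        self.keys = []        # cached keys, in arbitrary order
        self.index = {}       # key -> position in self.keys

    def access(self, key):
        """Return True on a hit; on a miss, load the key (evicting if needed)."""
        if key in self.index:
            return True
        if len(self.keys) >= self.capacity:
            # Pick a random victim and reuse its slot for the new key.
            pos = self.rng.randrange(len(self.keys))
            del self.index[self.keys[pos]]
            self.keys[pos] = key
            self.index[key] = pos
        else:
            self.index[key] = len(self.keys)
            self.keys.append(key)
        return False

def hit_rate(weights, capacity, requests, seed=0):
    """Simulate `requests` accesses drawn with the given per-item weights."""
    rng = random.Random(seed)
    cache = RandomEvictionCache(capacity, rng)
    stream = rng.choices(range(len(weights)), weights=weights, k=requests)
    hits = sum(cache.access(key) for key in stream)
    return hits / requests

n_items, capacity, requests = 10_000, 3_500, 200_000
skewed = [1.0 / (rank + 1) for rank in range(n_items)]   # Zipf-like popularity (assumed)
uniform = [1.0] * n_items                                 # every item equally likely

print(f"skewed traffic:  {hit_rate(skewed, capacity, requests):.2f}")
print(f"uniform traffic: {hit_rate(uniform, capacity, requests):.2f}")
```

With evenly spread traffic the hit rate cannot rise much above the fraction of items that fit in the cache, which is the thrashing case from question 8. With skewed traffic the popular items are re-requested often enough to stay cached most of the time, even though eviction is random.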

Cheers,
Stefan

--
This posting is provided "AS IS" with no warranties, and confers no rights

New to MCMS?
Check out this book: Building Websites Using MCMS: http://tinyurl.com/6zj44
----------------------

E R Doll
Posted: Mon Oct 24, 2005 4:51 pm    Post subject: Re: Node cache implications with a very large sites

Thanks Stefan,

Are you aware of any CMS installations that have 20,000 Resources in their
Resource galleries?

I suspect that 20,000 documents is "large" in terms of CMS implementations.
I'm curious whether we are treading into unknown territory.

Thanks,

E R Doll

Stefan [MSFT]
Posted: Mon Oct 24, 2005 4:51 pm    Post subject: Re: Node cache implications with a very large sites

Hi,

I'm aware of installations with more than a million resource gallery items.

Cheers,
Stefan

E R Doll
Posted: Mon Oct 24, 2005 4:51 pm    Post subject: Re: Node cache implications with a very large sites

Stefan,

Thank you for the thorough reply. BTW, I'm looking forward to your new
book. Amazon says it's coming out in November now.

Some additional questions:

The existing intranet we are migrating has a large number of documents
(.doc, .xls, .ppt, .pdf, etc.): 15,000 to 20,000 will be migrated.

The client anticipates growth, say doubling over the next few years after
the site goes live and is actively used.

I'm aware of the recommended 300 object limit for each Resource Gallery and
plan to organize the documents in the resource galleries so that each stays
well below this limit.
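
As a rough planning aid, the number of galleries needed to stay under the limit can be worked out up front. This is a minimal sketch: the `plan_galleries` function is hypothetical, and the cap of 200 documents per gallery is an arbitrary choice for "well below 300", not a recommended value.

```python
import math

def plan_galleries(document_count, max_per_gallery=200):
    """Return how many galleries are needed and how many documents each would hold."""
    if max_per_gallery < 1:
        raise ValueError("max_per_gallery must be at least 1")
    gallery_count = math.ceil(document_count / max_per_gallery)
    if gallery_count == 0:
        return 0, []
    # Spread documents evenly instead of filling galleries to the cap one by one.
    base, extra = divmod(document_count, gallery_count)
    sizes = [base + 1] * extra + [base] * (gallery_count - extra)
    return gallery_count, sizes

count, sizes = plan_galleries(20_000)
print(count, max(sizes))   # 100 200
count, sizes = plan_galleries(40_000)   # after the anticipated doubling
print(count, max(sizes))   # 200 200
```

In practice the split would follow the site's topic structure rather than an even division, so the counts above are a lower bound on the number of galleries to create.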

What performance considerations have people found with storing that number
of documents in the Resource Galleries?

Are there some best practices when dealing with a large number of resources,
such as flushing the disk cache periodically? Maximum size of disk cache?
etc?

Thank you for your help,

E R Doll

Stefan [MSFT]
Posted: Mon Oct 24, 2005 4:51 pm    Post subject: Re: Node cache implications with a very large sites

Hi,

actually performance will decrease when the number of items exceeds the 300
item limit. The amount depends on your specific system.

You should never flush the disk cache. Doing this will decrease the
performance.
In addition you should set the disk cache size bigger than the sum of all
resources in all resource galleries to ensure that the disk cache can hold
all items without the need to invalidate them during operation.
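
Before migration, the sizing rule above can be approximated by summing the files that will become resources. This is a sketch under assumptions: it measures a folder of documents on disk, not the MCMS repository, and the function names and the headroom factor of 2.0 (chosen to match the anticipated doubling of content) are hypothetical.

```python
import math
import os

def total_size_bytes(root_dir):
    """Sum the sizes of all regular files below root_dir."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):          # skip broken links and the like
                total += os.path.getsize(path)
    return total

def suggested_cache_mb(resource_bytes, headroom=2.0):
    """Resource total times a headroom factor, rounded up to whole megabytes."""
    megabyte = 1024 * 1024
    return math.ceil(resource_bytes * headroom / megabyte)

# Example: 30 GB of documents with room for the content to double.
print(suggested_cache_mb(30 * 1024**3))   # 61440
```

Running `total_size_bytes` against the folder holding the documents to be migrated gives the input for `suggested_cache_mb`. The result is only a starting point and should be rechecked as galleries grow.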

Cheers,
Stefan

E R Doll
Posted: Mon Oct 24, 2005 8:51 pm    Post subject: Re: Node cache implications with a very large sites

Stefan,

Thanks for the information. Also, broader thanks for your contributions to
the CMS community (in case you don't hear it from others). I've been doing
CMS projects since 2002 and am on my 13th CMS project. Your dedication to
helping people out is invaluable to making the product a success.

As I was finishing up my recommendation that CMS is capable of scaling to
the size I allude to in the thread above, someone called my attention to this
thread:

http://msdn.microsoft.com/newsgroups/default.aspx?&query=hadi&lang=en&cr=US&guid=&sloc=en-us&dg=microsoft.public.cmserver.general&p=1&tid=a4ef9369-5cc0-4c9f-b087-0abcdd48a6d1&mid=2a4bd56d-5855-4013-b68b-9b8c0a5705e9

In it you say:

-------------
"MCMS is not a document management system.
Storing this amount of resources in resource galleries can become a huge
effort as resource galleries need to be created manually and one gallery is
not allowed to hold more than 300 items.

In addition the node cache would be blown up with these items.

So if possible - and if access should not be tightened based on MCMS user
roles - you should better store content on the file system. "
-------------

Are you addressing the 300 item limit here, or are you addressing the total
number of documents in the resource galleries? Or maybe just the
administrative effort to manage a larger resource gallery structure?

This posting seems to be in conflict with your response to my questions, so
I wanted to make sure I have everything clear.

Thanks,

E R Doll

Stefan [MSFT]
Posted: Mon Oct 24, 2005 8:51 pm    Post subject: Re: Node cache implications with a very large sites

Hi,

For the manual resource gallery creation I'm addressing the 300 item limit.
But I also addressed the overall item limit.

As you can see, there is a balance you need to strike. Especially if you are
planning to store many huge items (more than 1 MB), you should consider
storing them in the file system, as the initial download from the
repository to the disk cache after a restart can otherwise cause some
delays.

Cheers,
Stefan

E R Doll
Posted: Thu Oct 27, 2005 8:51 pm    Post subject: RE: Node cache implications with a very large sites

Stefan,

Two additional follow-up questions:

1) In other posts I have seen it mentioned that exceeding the 300 object limit
for containers will not only cause performance problems when accessing that
particular container but will also impact overall site performance. Can
you expand on that a bit? Why is the overall site performance reduced? Are
postings that don’t use the excessively large containers also impacted?
2) I couldn’t find this one in the documentation: is it possible in a
load balanced production environment to point the disk cache to a shared
location for all servers (read-only servers)? Are there dependencies between
the node cache and disk cache?

Thanks for your help

E R Doll

Stefan [MSFT]
Posted: Fri Oct 28, 2005 8:51 am    Post subject: Re: Node cache implications with a very large sites

Hi,

The reason is that when one user accesses this container, the CPU time used
by the thread handling that request is taken away from requests going to
other parts of the site.

So if any request goes to such a channel, the overall site
performance for all other requests will also be reduced.
Hope this makes things clearer.

Cheers,
Stefan
