2014-09-04
|
16:06 |
<bd808> |
Manually cleaned bogus LocalRenameUserJob jobs from redis |
[releng] |
13:54 |
<_joe_> |
stopped puppet on all the appservers except mw03, testing an apache change |
[releng] |
05:28 |
<legoktm> |
stopping jobrunner on deployment-jobrunner01 |
[releng] |
05:22 |
<legoktm> |
restarted jobrunner on deployment-jobrunner01 |
[releng] |
05:14 |
<bd808> |
Bad jobs in job queue filled up /var on jobrunner01 and killed jobrunner script. Leaving down for now until I find out how to delete the bad jobs. |
[releng] |
01:41 |
<bd808> |
Killed old jobs-loop.sh processes on deployment-jobrunner01 |
[releng] |
01:24 |
<bd808> |
Many jobrunner errors like "wikiversions-labs.cdb has no version entry for `amwiki`" with various wiki names |
[releng] |
01:23 |
<bd808|AWAY> |
Started jobrunner service manually on jobrunner01. |
[releng] |
00:44 |
<bd808> |
Puppet run on deployment-jobrunner01 failing with what seem to be dns issues (getaddrinfo: Name or service not known when Trebuchet is running) |
[releng] |
00:35 |
<bd808> |
Puppet run on deployment-jobrunner01 failing with what seem to be dns issues (getaddrinfo: Name or service not known) |
[releng] |
2014-08-27
|
23:03 |
<hashar> |
Blacklisting the security audit IP again on deployment-cache-bits01, deployment-cache-mobile03 and deployment-cache-text02 |
[releng] |
22:53 |
<hashar> |
removed the blackhole ip route from deployment-cache-text02 and deployment-cache-mobile03 |
[releng] |
22:48 |
<hashar> |
the IP belongs to a known security audit. See Chris Steipp. |
[releng] |
22:46 |
<hashar> |
blackholed an IP address on deployment-cache-text02 and deployment-cache-mobile03; it was sending hundreds of requests per second and overloading the beta cluster. Use route -n to find the IP |
[releng] |
22:37 |
<hashar> |
restarting udp2log-mw on deployment-bastion. It has been crashing repeatedly since fairly recently |
[releng] |
22:26 |
<bd808> |
when restarting varnish on deployment-cache-text02, don't forget that there are 2 varnish services (varnish and varnish-frontend) |
[releng] |
22:19 |
<bd808> |
restarted varnish (again) on deployment-cache-text02 |
[releng] |
22:10 |
<bd808> |
restarted varnish on deployment-cache-text02 |
[releng] |
16:22 |
<bd808> |
killing `apt-get update` process running on deployment-bastion since Jun13 |
[releng] |
14:59 |
<bd808> |
Resolved puppet git merge conflict on deployment-salt |
[releng] |
14:49 |
<bd808> |
Moved hhvm core dumps to /data/project/hhvm-cores |
[releng] |
14:42 |
<bd808> |
Root drive full on deployment-mediawiki02; hhvm core files are the culprit |
[releng] |