01 Mar 2015

PHP Startup Internals

  • PHP
  • MySQL
  • Jenkins
  • Apache/Nginx
  • Linux (Ubuntu)
  • node.js/io.js

My goal with this post (and any subsequent posts) is to share my thoughts and current practices on the topic of developing PHP applications in a startup environment.

Launching a startup means making decisions: which framework to choose, which tools to use, which programming language, which task should come before which other task, and so on.

Starting is often overwhelming. What should be done first? If we ignore all the questions about the business (what sector? any specific niche? what sort of product?), then the first thing that an individual or a team should aim for is to prepare for iteration.

Many would start by working directly on their first project. It makes sense since it is the primary goal of your startup to produce results. However, writing code without establishing some sort of workflow framework will be inefficient.

My first step is generally to set up Jenkins, a continuous integration tool. It allows me to set up automated testing and automated deployment to a development/staging environment. This is useful for two purposes:

  1. Having an external "party" execute the tests in its own environment (separate from mine). This validates that whatever is in source control will work on someone else's computer.
  2. Automatically deploying "stable" versions (in the sense that they pass testing) to a public-facing server. With automated deployment, I can keep writing code, have it tested and then deployed to a server where I can ask others to take a look and provide feedback.

There are a couple of ways to get set up.

Everything will be set up on the same machine. Here is how it basically goes:

  1. Install Jenkins
  2. Create two Jenkins jobs: project-name-develop, which builds the develop branch of your repository and runs the tests (basic continuous integration), and project-name-develop-to-development, which again builds the develop branch of your repository, but this time for the purpose of making it available online.

There won't be much to discuss here except a list of plugins that are almost mandatory (either because they make Jenkins much more useful or because they let you diagnose issues more quickly).

  • AnsiColor
  • Checkstyle Plug-in
  • Clover PHP plugin
  • Credentials Plugin
  • Duplicate Code Scanner Plug-in
  • GIT client plugin
  • GIT plugin
  • HTML Publisher plugin
  • JDepend Plugin
  • JUnit Plugin
  • Mailer Plugin
  • Matrix Authorization Strategy Plugin
  • Matrix Project Plugin
  • Node and Label parameter plugin
  • Parameterized Trigger plugin
  • Plot plugin
  • PMD Plug-in
  • Self-Organizing Swarm Plug-in Modules
  • Slack Notification Plugin
  • SSH Credentials Plugin
  • SSH Slaves plugin
  • Static Analysis Utilities
  • Throttle Concurrent Builds Plug-in
  • Timestamper
  • Violations plugin
  • xUnit plugin

I'll now go into more detail about what each job does.

  1. Pull the latest revision from the repository
  2. Download and update composer (if required)
  3. Install dependencies
    1. bower install
    2. npm install
    3. composer install
  4. Build assets to validate they compile
    1. Compile LESS into CSS
    2. Concatenate and minify JS
  5. Prepare the application environment
    1. Migrate database
    2. Seed database
  6. Run continuous integration tools to assert code quality
    1. phpunit
    2. phploc
    3. pdepend
    4. phpmd
    5. phpcs
    6. phpcpd

A build cycle here should ideally take less than 5 minutes, and 30 minutes at most. The goal is to know quickly after pushing changes to your repository that nothing is broken.
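
The step list above can be sketched as a small runner that executes each command in order and stops at the first failure. This is only an illustration in Python, not the actual Jenkins job configuration (which would normally be shell build steps); the names run_steps and build_succeeded are hypothetical, and the quality tools and migrate/seed commands are left out because their arguments depend on the project.

```python
import subprocess

# Commands taken from the step list above. Tools that need project-specific
# arguments (phploc, pdepend, phpmd, phpcs, phpcpd, migrate/seed) are omitted.
BUILD_STEPS = [
    ("bower install", ["bower", "install"]),
    ("npm install", ["npm", "install"]),
    ("composer install", ["composer", "install"]),
    ("phpunit", ["phpunit"]),
]


def run_steps(steps, cwd=None):
    """Run each (name, command) pair in order and stop at the first failure.

    Returns a list of (name, exit_code) for the steps that were attempted.
    """
    results = []
    for name, command in steps:
        exit_code = subprocess.run(command, cwd=cwd).returncode
        results.append((name, exit_code))
        if exit_code != 0:
            break  # a failed step fails the whole build
    return results


def build_succeeded(results, steps):
    """True only if every step ran and exited with code 0."""
    return len(results) == len(steps) and all(code == 0 for _, code in results)
```

Stopping at the first failing step keeps the feedback loop short, which is the point of this job.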

For this to work, you simply need to make a symbolic link from the Jenkins project workspace to some path which Apache/Nginx makes available to external users. For example:

/home/jenkins/workspace/project-a-develop-to-development/public -> /var/www/development/project-a
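
As a minimal sketch, the link could be created as follows. It assumes the link itself lives at the web-served path and points at the workspace's public directory; publish_workspace is a hypothetical helper name, and the two paths are the ones from the example above.

```python
import os


def publish_workspace(workspace_public, web_path):
    """Make web_path a symbolic link that points at workspace_public."""
    if os.path.islink(web_path):
        os.remove(web_path)  # replace a stale link from a previous setup
    os.symlink(workspace_public, web_path)
    return os.readlink(web_path)


# Paths from the example above (running this needs write access to /var/www):
# publish_workspace(
#     "/home/jenkins/workspace/project-a-develop-to-development/public",
#     "/var/www/development/project-a",
# )
```

The web server must also be allowed to follow symbolic links and to read the Jenkins workspace for this to serve anything.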

  1. Pull the latest revision from the repository
  2. Download and update composer (if required)
  3. Install dependencies
    1. bower install
    2. npm install
    3. composer install
  4. Build/Prepare website
    1. Compile LESS into CSS
    2. Concatenate and minify JS
  5. Prepare the application environment
    1. Migrate database
    2. Seed database

An iterative cycle here should take less than 5 minutes. Anything that takes longer than that would be suspicious.

Now that you have both projects set up, here's how things work. First, project-name-develop is triggered every 1-5 minutes and checks the repository for changes. If changes are detected, a build starts and verifies that the current state of the code is valid.

Once the build finishes, if it is successful, project-name-develop-to-development will start (triggered on project-name-develop success). It will deploy the stable code so that users may test it.

A whole change cycle will generally take from 1 to 30 minutes depending on how many tests you have and how well you've been able to optimize your Jenkins build workflow.
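
The chaining of the two jobs can be pictured with the following sketch. It is not how Jenkins is implemented; run_cycle, build and deploy are hypothetical names standing in for one polling pass, the project-name-develop job and the project-name-develop-to-development job.

```python
def run_cycle(last_built, current_revision, build, deploy):
    """One polling pass: build on change, deploy only if the build passed.

    Returns the revision that should be remembered as "last built".
    """
    if current_revision == last_built:
        return last_built  # nothing new in the repository
    if build(current_revision):
        deploy(current_revision)  # downstream job, triggered on success
    return current_revision
```

A failed build is still remembered as built, so the same broken revision is not rebuilt on every poll, and nothing gets deployed until a later revision passes.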

Here's a list of things to try/check:

  • If you are running phpunit with code coverage, disable it and run it in a separate Jenkins project. Running with code coverage is 2-5x slower than running without it. When you run the tests, you want the results fast; code coverage should not be a priority. Speed is the priority.
  • If you are running tests against a database and the tests require setting up and tearing down the database (either just truncating the tables or a full DROP of the tables), search for ways to avoid hitting the database or to improve performance. For example, if you are testing using SQLite, run an initial database migration and seeding, then keep the resulting .sqlite file so that it can be copied on test setup instead of migrating/seeding every time.
  • If migrating/seeding takes a long time, keep the resulting .sqlite file and only rebuild it if its source files (dependencies) have changed. On a project, you will run tests much more often than you will rebuild the .sqlite file, so it is worth investing in developing such a tool.
  • Since PHP is single-threaded, look for tools that enable multi-process PHP testing. An example of such a tool is liuggio/fastest. Depending on the number of processors/cores you have available, you could see a 4-8x gain in speed.
  • If you have the money/hardware, distribute testing over many machines. If you want unified phpunit code coverage/results, you can use phpcov to merge separate test results into a single result file.
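
The SQLite suggestions above (copy a prepared database on test setup, rebuild it only when its sources change) could look roughly like this. It is a sketch under assumptions: the helper names, the ".fingerprint" side file and the caller-supplied build function are all hypothetical, and the change detection simply hashes the contents of the migration/seed files.

```python
import hashlib
import os
import shutil


def fingerprint(source_files):
    """Hash the contents of the migration/seed files, in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(source_files):
        with open(path, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()


def ensure_template(template_path, source_files, build):
    """Rebuild the template database only when its source files changed.

    `build(template_path)` is a caller-supplied function that runs the
    migrations and seeders. Returns True if a rebuild happened.
    """
    stamp_path = template_path + ".fingerprint"
    current = fingerprint(source_files)
    if os.path.exists(template_path) and os.path.exists(stamp_path):
        with open(stamp_path) as handle:
            if handle.read() == current:
                return False  # template is still up to date
    build(template_path)
    with open(stamp_path, "w") as handle:
        handle.write(current)
    return True


def fresh_database(template_path, test_db_path):
    """Give a test its own database by copying the prepared template."""
    shutil.copyfile(template_path, test_db_path)
    return test_db_path
```

In a test suite, ensure_template would run once before the tests and fresh_database in each test's setup, so the per-test cost is a file copy rather than a full migration and seed.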

The following depicts how I “solved” a problem I recently had regarding munin, its mysql plugins and the shared memory cache library used by the plugins (written in perl and using IPC::ShareLite).

First off, let’s begin with a description of the problem. I posted the following on serverfault.com in hope I’d get help from someone more experienced than I am.

I’ve recently setup a munin-node on a CentOS server. All was working fine until I tried to add the apache plugin (which works fine).

For some odd reason, the mysql plugins for munin that used to work ceased to work… I’m now getting a weird error whenever I’m running the plugin with munin-run. For instance

munin-run mysql_files_tables

returns me


IPC::ShareLite store() error: Identifier removed at /usr/lib/perl5/vendor_perl/5.8.8/Cache/SharedMemoryBackend.pm line 156

but sometimes it will also return


table_open_cache.value 64
Open_files.value 58
Open_tables.value 64
Opened_tables.value 19341

but after a while it will revert to the previous error.

I do not have any knowledge about the IPC or the ShareLite library so I don’t really know where to start looking. Since it is a module related to shared memory, I tried tracking down shared memory segments with ipcs without much success.

I haven’t yet rebooted the machine as it is used for many projects (I’d obviously like to be able to diagnose the problem without requiring a restart if it was possible).

Has anyone faced this problem? (a quick search on google didn’t present any relevant help)

Thanks for the help!

Obviously, one can see quickly that this is a quite specific question that not many may have actually encountered. Thus, I didn’t expect to receive much help out of it (and I didn’t).

I had left this issue on the side for a couple of days hoping to come back to it at some point. Munin and the mysql plugins were installed on two servers and it was working fine on both of them (and a third one as master node). After a minor change, one of two client nodes stopped working correctly while the other was still fine. After a couple of days though the second server also decided to exhibit a similar issue…

Tonight I remembered about strace, which is pretty awesome in circumstances like this one. I went ahead and launched strace munin-run mysql_files_tables which outputted a lot of stuff and then stopped at the following point:


...
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff13da8e30) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = 0
read(4, "# Carp::Heavy uses some variable"..., 4096) = 4096
brk(0x163e7000)                         = 0x163e7000
read(4, "\n    redo if $Internal{$caller};"..., 4096) = 1737
read(4, "", 4096)                       = 0
close(4)                                = 0
write(2, "IPC::ShareLite store() error: Id"..., 123IPC::ShareLite store() error: Identifier removed at /usr/lib/perl5/vendor_perl/5.8.8/Cache/SharedMemoryBackend.pm line 156
) = 123
semop(14581770, 0x2ab08bb67cf0, 3

and when it is actually fixed, the application would end instead (outputting a bunch of stuff such as the following)


...
stat("/usr/lib64/perl5/auto/Storable/_freeze.al", {st_mode=S_IFREG|0644, st_size=706, ...}) = 0
stat("/usr/lib64/perl5/auto/Storable/_freeze.al", {st_mode=S_IFREG|0644, st_size=706, ...}) = 0
open("/usr/lib64/perl5/auto/Storable/_freeze.al", O_RDONLY) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffe7223570) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR)                   = 0
read(4, "# NOTE: Derived from ../../lib/S"..., 4096) = 706
read(4, "", 4096)                       = 0
close(4)                                = 0
semop(917514, {{1, 0, 0}, {2, 0, 0}, {2, 1, SEM_UNDO}}, 3) = 0
semop(917514, {{2, -1, SEM_UNDO|IPC_NOWAIT}}, 1) = 0
semop(917514, {{1, 0, 0}, {2, 0, 0}, {2, 1, SEM_UNDO}}, 3) = 0
shmdt(0x7fc30021f000)                   = 0
semop(917514, {{2, -1, SEM_UNDO|IPC_NOWAIT}}, 1) = 0
...

What you can see in the first output above is pretty interesting. The semop call gives you the semid the process is trying to obtain (the semaphore used to synchronize different processes using the same shared memory). The signature of the semop function is as follows:


int semop(int semid, struct sembuf *sops, unsigned nsops);

where
semid: semaphore id
sops: pointer to a sembuf struct


struct sembuf {
    u_short sem_num; /* semaphore # */
    short   sem_op;  /* semaphore operation */
    short   sem_flg; /* operation flags */
};

nsops: the length of sops

Upon first inspection, you can see that the sembuf in the first case seems to be invalid if you compare it with the working version, where it is actually resolved (strace displays something such as {{2, -1, SEM_UNDO|IPC_NOWAIT}} instead of 0x2ab08bb67cf0). But that is not helping me much.

With that semid you can do two things: first, you can check whether it is still alive by calling ipcs; second, you can remove it with ipcrm -s semid.

In my case the “fix” itself was to remove the semaphore that the plugin wasn’t able to obtain (the reason for this still eludes me though). After the removal of the semaphore, munin runs correctly again and the “Identifier removed” error is gone.
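
As an illustration, the semid can be pulled out of the blocked semop line from the strace output and turned into the matching ipcrm command. The helper names below are hypothetical, and the command is only built, not executed, since removing a semaphore that other processes still use is something to check with ipcs first.

```python
import re

# Matches the first argument of a semop(...) call in strace output.
SEMOP_CALL = re.compile(r"\bsemop\((\d+),")


def semid_from_strace(line):
    """Return the semaphore id from a strace `semop(...)` line, or None."""
    match = SEMOP_CALL.search(line)
    return int(match.group(1)) if match else None


def removal_command(semid):
    """Build the `ipcrm -s <semid>` command as an argument list (not run here)."""
    return ["ipcrm", "-s", str(semid)]


blocked_call = "semop(14581770, 0x2ab08bb67cf0, 3"
semid = semid_from_strace(blocked_call)
print(" ".join(removal_command(semid)))  # ipcrm -s 14581770
```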

I will have to do more research as to how/why this issue occurs as I’ve seen it happen only on CentOS machines so far (the master server is a Debian machine).

For some odd reason, it seems that TortoiseGit's design doesn't let the user respond when git requires user interaction. For instance, accepting self-signed certificates is not possible, which gives you the known issue 318.

As of TortoiseGit 1.8.5.0, it is still not possible to accept certificates through the GUI. But it is possible to get your git repository to work with TortoiseGit (and work with the required certificate).

You will need to have msysgit installed and available in your PATH for the following to work.

The first step is to run some git command, for instance git clone https://myserver.com/depot via a command line so that git may auto-accept (or ask you to accept) the certificate. This step is crucial to get the certificate details saved onto your machine.

What you will want to do next is go to C:\Program Files (x86)\Git\.subversion and copy everything into %USERPROFILE%\.subversion. Basically, this should copy the certificates that were accepted by msysgit so they can be used by TortoiseGit.

Another, and possibly better solution, is to create a symbolic link so that those 2 folders are in fact a single one. For instance, you could do something such as


REM Quote the paths in case the user profile path contains spaces
move "%USERPROFILE%\.subversion" "%USERPROFILE%\.subversion_backup"
mklink /D "%USERPROFILE%\.subversion" "C:\Program Files (x86)\Git\.subversion"

which will make %USERPROFILE%\.subversion point to your C:\Program Files (x86)\Git\.subversion folder. This has the benefit that any future certificate will work both for msysgit and TortoiseGit.

Thanks to Mexx’ C# Corner for pointing out the solution.

As 2012 ended, I wanted to take a look back at this year and review my computer usage/consumption in order to reduce time wasting activity. That time should be channeled into more meaningful activities like learning a new language, improving my current skills, practicing piano and more.

The following data has been collected from August 11, 2012 to December 31, 2012. There are about 14 days which do not have any data (application was closed).

The data covers my computer usage with over 966.21h of active usage. During the period for which I collected data, the computer was also powered off for 1686.30h and left unused (away) for approximately 58.04h.

Counting 2 months of 31 days + 2 months of 30 days + 20 days (August) - 14 days without data gives 128 days of data. This averages to 7.55h/day of active computer usage. The way it is currently "structured", however, is that computer usage during the week is about 4-5h/day while on the week-end it is about 12h/day.
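
The arithmetic behind those two figures, written out as a short sketch that uses only the numbers quoted above:

```python
ACTIVE_HOURS = 966.21

# 2 months of 31 days + 2 months of 30 days + 20 days (August) - 14 days without data
days_with_data = 2 * 31 + 2 * 30 + 20 - 14
hours_per_day = ACTIVE_HOURS / days_with_data

print(days_with_data)           # 128
print(round(hours_per_day, 2))  # 7.55
```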

This sounds a bit high, but there's a reason for this. I'm not ACTIVELY using the computer for all that time. In ManicTime, the computer is considered active if it is used at least once within a 60-minute time frame. This means I could potentially be using the computer for 1 minute (or less) every hour and it would count as an active usage of 1h. But for the sake of this review, I'll consider myself a computer addict (which I am) and will count every minute as an active minute.

The following top 10 items account for 876.18h out of the 966.21h of active usage of the computer.

Application                      Hours
Google Chrome                   457.76
Remote Desktop Connection       179.11
League of Legends (TM) Client    72.69
VLC media player                 62.90
HexChat                          35.81
Free Alarm Clock                 15.02
mRemote                          14.90
Sublime Text 2                   14.56
Windows Explorer                 12.13
Torchlight II                    11.30

At first sight we can see that I spend a lot of time browsing the web. I do various things on there and since it is the biggest chunk of my time, it is worth looking at what exactly I do on the web. The following table lists the top 10 websites I've spent time on.

Website              Hours
www.reddit.com      165.53
www.jolteon.net      19.33
www.youtube.com      17.51
docs.google.com      12.79
-confidential-       12.66
www.twitch.tv         9.34
www.google.ca         8.18
en.wikipedia.org      4.92
www.facebook.com      4.81
mail.google.com       4.08

This covers 259.15h out of 457.76h (56.6%) spent in Chrome. This means that I have a long tail (a list of many different websites which I visit for a brief period) of 198.61h. The major time consumer here is www.reddit.com, which accounts for 36.2% of my time browsing. Even though reddit is a news/media website (useful for staying up to date with world events, not sure I do that...), it also contains a lot of content which I would categorize as time wasters: funny pictures, pictures of cats, videos as well as discussions about topics of interest (computer science, software engineering, robotics, electronics, etc.). I spent about 1.3h/day on it during the 128 days for which I collected data, which I find to be quite a lot.
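
Those shares can be re-derived from the website table with a few lines. This is just the arithmetic spelled out; the variable names are arbitrary and the figures are the ones listed above.

```python
TOP_SITES = {
    "www.reddit.com": 165.53,
    "www.jolteon.net": 19.33,
    "www.youtube.com": 17.51,
    "docs.google.com": 12.79,
    "-confidential-": 12.66,
    "www.twitch.tv": 9.34,
    "www.google.ca": 8.18,
    "en.wikipedia.org": 4.92,
    "www.facebook.com": 4.81,
    "mail.google.com": 4.08,
}
CHROME_HOURS = 457.76
DAYS_WITH_DATA = 128

top_hours = round(sum(TOP_SITES.values()), 2)
top_share = round(100 * top_hours / CHROME_HOURS, 1)
long_tail = round(CHROME_HOURS - top_hours, 2)
reddit_share = round(100 * TOP_SITES["www.reddit.com"] / CHROME_HOURS, 1)
reddit_per_day = round(TOP_SITES["www.reddit.com"] / DAYS_WITH_DATA, 1)

print(top_hours, top_share, long_tail)  # 259.15 56.6 198.61
print(reddit_share, reddit_per_day)     # 36.2 1.3
```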

As for the other sites, here's a couple of notes:
www.jolteon.net: This is a bug tracker I use to track new features/bugs where I work. It serves as a personal system for me to track these issues. I use it frequently to update task statuses as well as enter anything that I may have forgotten to add during the day. I also like to review it frequently to remind myself of what is left to work on (and let my mind figure that out while I sleep)
www.youtube.com: I often get on youtube because of reddit. I enjoy watching documentaries which last from 30-45 minutes on average.
docs.google.com: I've spent some time writing documents in Google Docs simply because it allowed me to share them with others so they could review my work.
-confidential-: This is a website I use to manage "things" for work. I generally go there every day and it takes me from 5-15 minutes on average.
www.twitch.tv: Watching streamers of Starcraft 2 and LoL for a while.
www.google.ca: Looks like I spend a lot of time searching...
en.wikipedia.org: Whenever I don't know something about a subject of interest, wiki is a good source (generally...)
www.facebook.com: Checking that everyone I know is still alive
mail.google.com: Because I like spam

If we go back to the applications I use, the next in the list is Remote Desktop Connection. I use Remote Desktop Connection to connect to my PC at work so that I can do some work from home. As you can see, I have spent about 1.4h/day working remotely (179.11h over 128 days). Considering that I am not a "work at home" employee, I find this to be outrageously high. I would like to see this be as close as possible to 0h/day.

Next is League of Legends (TM) Client. I've recently been interested in the game and started to play it on a more regular basis. I would like to keep this at around 1h/day or lower.

I've used VLC media player to watch series as well as movies on my PC. Series episodes are 20-45 minutes while movies vary from 1h40 to 3h40. I'd say that about 225h/year looks like an acceptable amount of time spent on this.

I've stopped using MSN to chat with friends. My main communication channel is now through IRC, for which I use the HexChat client. I want to spend a maximum of 1h/day on communication though.

I'm not too sure why Free Alarm Clock is part of the top 10. I believe this has to do with the note I wrote at the beginning mentioning that ManicTime would consider a program active if there was some movement on the screen in the last 60 minutes. Since Free Alarm Clock was set to show up (and take focus) every hour, it is quite possible that it simply appeared from time to time while I was away and "sucked" the time out of whatever was running in the background.

mRemote is another application I used briefly to do remote desktop. Since it doesn't support multiple monitor remoting, I've stopped using it.

Sublime Text 2 is my text editor of choice. I haven't spent a lot of time in it since August mainly because I haven't been doing any coding at all in the past few months.

Windows Explorer: some time spent searching for files on my PC!

I've played through the whole campaign of Torchlight II, which was pretty awesome! I'd be really happy to try multiplayer with some willing friends to see what the end content is like (single player end of game content was pretty funny, but playing it alone wasn't very satisfying for me).

The important part of this process, other than reviewing what time was spent on this year, is to decide on new objectives going forward. This means deciding what should be cut, reduced, increased or added. For each application I've already determined my goals/limits, so I simply need to make sure I follow them. A monthly review should be sufficient.

  • Reduce reddit usage below 1h/day
  • Reduce/cut time spent doing remoting for work
  • Redistribute free time on learning activities and skills improvement

05 Jan 2013

Short SSD setup guide

Back up everything. First rename, then remove once you have confirmed it worked. Better safe than sorry.


REM Set the new root location
REM Ex. NEW_ROOT=D:
REM     NEW_ROOT=D:\Data
REM     NEW_ROOT=D:\MyStuff\MyComputer
set NEW_ROOT=D:

REM (Make sure that C:\Users doesn't exist anymore)
move C:\Users C:\Users.Backup
mklink /J C:\Users %NEW_ROOT%\Users

REM DO NOT DO THE FOLLOWING
REM (Rename C:\ProgramData before creating the link)
move C:\ProgramData C:\ProgramData.Backup
mklink /J C:\ProgramData %NEW_ROOT%\ProgramData

REM Update the TEMP folder at system level
setx /m TEMP %NEW_ROOT%\Windows\Temp
setx /m TMP %NEW_ROOT%\Windows\Temp

move C:\Windows\Temp C:\Windows\Temp.Backup
mklink /J C:\Windows\Temp %NEW_ROOT%\Windows\Temp
REM END DO NOT DO